Preface



0.1 What is meant by "Planning Algorithms"? 

Due to many exciting developments in the fields of robotics, artificial intelligence, 
and control theory, three topics that were once quite distinct are presently on a 
collision course. In robotics, motion planning was originally concerned with prob- 
lems such as how to move a piano from one room to another in a house without 
hitting anything. The field has grown, however, to include complications such as 
uncertainties, multiple bodies, and dynamics. In artificial intelligence, planning 
originally meant a search for a sequence of logical operators or actions that trans- 
form an initial world state into a desired goal state. Presently, planning extends 
beyond this to include many decision-theoretic ideas such as Markov decision 
processes, imperfect state information, and game-theoretic equilibria. Although 
control theory has traditionally been concerned with issues such as stability, feed- 
back, and optimality, there has been a growing interest in designing algorithms 
that find feasible open-loop trajectories for nonlinear systems. In some of this
work, the term motion planning has been applied, with an interpretation that
differs from its use in robotics. Thus, even though each originally considered
different problems, the fields of robotics, artificial intelligence, and control
theory have expanded
their scope to share an interesting common ground. 

In this text, I use the term planning in a broad sense that encompasses this 
common ground. This does not, however, imply that the term is meant to cover 
everything important in the fields of robotics, artificial intelligence, and control 
theory. The presentation is focused primarily on algorithmic issues relating to
planning. Within robotics, the focus is on designing algorithms that generate
useful motions by processing complicated geometric models. Within artificial
intelligence, the focus is on designing systems that use decision-theoretic
models to compute appropriate actions. Within control theory, the focus of the
presentation
is on algorithms that numerically compute feasible trajectories or even optimal 
feedback control laws. This means that analytical techniques, which account for 
the majority of control theory literature, are not addressed here. 

The phrase "planning and control" is often used to identify complementary is- 
sues in developing a system. Planning is often considered as a higher-level process 
than control. In this text, we make no such distinctions. Ignoring old connotations 
that come with the terms, "planning" or "control" could be used interchangeably. 
Both can refer to some kind of decision making in this text, with no associated
notion of "high" or "low" level. A hierarchical planning (or control!) strategy 
could be developed in any case. 

0.2 Who is the Intended Audience? 

The text is written primarily for computer science and engineering students at 
the advanced undergraduate or beginning graduate level. It is also intended as 
an introduction to recent techniques for researchers and developers in robotics 
and artificial intelligence. It is expected that the presentation here would be 
of interest to those working in other areas such as computational biology (drug 
design, protein folding), virtual prototyping, and computer graphics. 

I have attempted to make the book as self-contained and readable as possible. 
Advanced mathematical concepts (beyond concepts typically learned by under- 
graduates in computer science and engineering) are introduced and explained. 
For readers with deeper mathematical interests, directions for further study are 
given at the end of some chapters. 

0.3 Suggested Use 

The ideas should flow naturally from chapter to chapter, but at the same time, 
the text has been designed to make it easy to skip chapters. 

If you are only interested in robot motion planning, it is only necessary to read 
Chapters 3-8, possibly with the inclusion of some discrete planning algorithms 
from Chapter 2 because they arise in motion planning. Chapters 3 and 4 provide 
the foundations needed to understand basic robot motion planning. Chapters 5 
and 6 present algorithmic techniques to solve this problem. Chapters 7 and 8 
consider extensions of the basic problem. If you are additionally interested in 
nonholonomic planning and other problems that involve differential constraints, 
then it is safe to jump ahead to Chapters 13-15, after completing Chapters 3-7. 

Chapters 11 and 12 cover problems in which there is sensing uncertainty. These 
problems live in an information space, which is detailed in Chapter 11. Chapter 
12 covers algorithms that plan in the information space. 

If you are mainly interested in decision-theoretic planning, then you can read 
Chapter 2, and jump straight to Chapters 9-12. The material in these later 
chapters does not depend much on Chapters 3 to 8, which cover motion planning.
Thus, if you are not interested in motion planning, those chapters may be easily
skipped.

0.4 Acknowledgments 

I am very grateful to many students and colleagues who have given me extensive 
feedback and advice in developing this text. 
Many thanks go to Stefano Carpin, Sanjit Jhala, Stephane Redon, Sanketh
Shetty, Mohan Sirchabesan, and Zbynek Winkler for pointing out mistakes in the 
on-line manuscript. 

I also appreciate the efforts of graduate students in my courses who scribed 
class notes which served as an early draft for some parts. These include students 
at Iowa State: Brian George, and students at the University of Illinois: Shamsi 
Tamara Iqbal, Rishi Talreja, Sherwin Tarn, Benjamin Tovar ... 

I am also thankful for the supportive environments provided both by Iowa 
State University and the University of Illinois. In both universities, I have been 
able to develop courses for which the material presented here has been developed 
and refined. 

I sincerely thank Krzysztof Kozlowski and his staff at the Politechnika Poz- 
nanska for their help during my sabbatical in Poland. 

0.5 Help! 

Since the text appears on the web, it is easy for me to incorporate feedback
from readers. This will be very helpful as I complete this project. If you find 
mistakes, have requests for coverage of topics, find some explanations difficult 
to follow, have suggested exercises, etc., please let me know by sending mail to 
lavalle@cs.uiuc.edu. Note that this book is currently a work in progress. Please be
patient as I update parts over the coming year or two. 



Chapter 1 
Introduction 



Chapter Status 




What does this mean? Check 

http://msl.cs.uiuc.edu/planning/status.html

for information on the latest version. 



1.1 Planning to Plan 



Planning is a term that means different things to different groups of people. A 
fundamental need in robotics is to have algorithms that can automatically tell 
robots how to move when they are given high-level commands. The terms motion 
planning and trajectory planning are often used for these kinds of problems. A 
classical version of motion planning is sometimes referred to as the Piano Mover's 
Problem. Imagine giving a precise 3D CAD model of a house and a piano as input 
to an algorithm. The algorithm must determine how to move the piano from one 
room to another in the house without hitting anything. Most of us have encoun- 
tered similar problems when moving a sofa or mattress up a set of stairs. Robot 
motion planning usually ignores dynamics and other differential constraints, and 
focuses primarily on the translations and rotations required to move the piano. 
Recent work, however, does consider other aspects, such as uncertainties,
differential constraints, modeling errors, and optimality. Trajectory planning
usually refers to the problem of taking the solution from a robot motion planning 
algorithm and determining how to move along the solution in a way that respects 
the mechanical limitations of the robot. 

Control theory has historically been concerned with designing inputs to sys- 
tems described by differential equations. These could include mechanical systems 
such as cars or aircraft, electrical systems such as noise filters, or even systems 
arising in areas as diverse as chemistry, economics, and sociology. Classically, 
control theory has developed feedback policies, which enable an adaptive response 
during execution, and has focused on stability, which ensures that the dynamics 
do not cause the system to become wildly out of control. A large emphasis is also
placed on optimizing criteria to minimize resource consumption, such as energy 
or time. In recent control theory literature, motion planning sometimes refers to 
the construction of inputs to a nonlinear dynamical system that drive it from an
initial state to a specified goal state. For example, imagine trying to operate a 
remote-controlled hovercraft that glides over the surface of a frozen pond. Sup- 
pose we would like the hovercraft to leave its current resting location and come to 
rest at another specified location. Can an algorithm be designed that computes 
the desired inputs, even in an ideal simulator that neglects uncertainties that arise 
from model inaccuracies? It is possible to add other considerations, such as uncer- 
tainties, feedback, and optimality, but the problem is already challenging enough 
without these. 
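As a rough illustration of the hovercraft question, consider a heavily simplified version: motion along a single line, unit mass, no friction, and a thrust input bounded by a hypothetical limit `u_max`. None of these modeling choices come from the text; they are assumptions made only to keep the sketch short. Under them, one way to move from rest to rest is to apply full thrust for half of the maneuver and full reverse thrust for the other half.

```python
import math


def rest_to_rest_inputs(distance, u_max, steps_per_phase=100):
    """Return (dt, inputs) for a full-thrust, then full-brake maneuver.

    Assumed model (not from the text): a unit mass on a frictionless line
    whose acceleration equals the input u, with |u| <= u_max.
    """
    if u_max <= 0:
        raise ValueError("u_max must be positive")
    direction = 1.0 if distance >= 0 else -1.0
    # Each phase covers half the distance: 0.5 * u_max * t_switch**2.
    t_switch = math.sqrt(abs(distance) / u_max)
    dt = t_switch / steps_per_phase
    thrust = [direction * u_max] * steps_per_phase
    brake = [-direction * u_max] * steps_per_phase
    return dt, thrust + brake


def simulate(position, velocity, dt, inputs):
    """Integrate the model; exact for inputs held constant over each step."""
    for u in inputs:
        position += velocity * dt + 0.5 * u * dt * dt
        velocity += u * dt
    return position, velocity


dt, inputs = rest_to_rest_inputs(distance=8.0, u_max=2.0)
final_position, final_velocity = simulate(0.0, 0.0, dt, inputs)
print(final_position, final_velocity)
```

The sequence of inputs is computed entirely in advance and ignores uncertainties, which is exactly the idealized setting described above. A realistic hovercraft moves in the plane and has coupled rotational dynamics, so this sketch only hints at the problem.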

In artificial intelligence, the term AI planning takes on a more discrete flavor. 
Instead of moving a piano through a continuous space, as in the robot motion 
planning problem, the task might be to solve a puzzle, such as the Rubik's cube or 
a sliding tile puzzle. Although such problems could be modeled with continuous 
spaces, it seems natural to define a finite set of actions that can be applied to 
a discrete set of states, and to construct a solution by giving the appropriate 
sequence of actions. Historically, planning has been considered different from 
problem solving; however, the distinction seems to have faded away in recent 
years. In this book, we do not attempt to make a distinction between the two. 
Also, substantial effort has been devoted to representation language issues in 
planning. Although some of this will be covered, it is mainly outside of our 
focus. Many decision-theoretic ideas have recently been incorporated into the AI 
planning problem, to model uncertainties, adversarial scenarios, and optimization. 
These issues are important, and are considered here in detail. 
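To make the discrete flavor concrete, here is a minimal sketch of planning for a 3-by-3 sliding tile puzzle by breadth-first search. The state encoding (a tuple of nine numbers with 0 for the blank), the action names, and the function names are illustrative assumptions, and breadth-first search is only one of many possible search strategies.

```python
from collections import deque

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # 0 marks the blank square
MOVES = {"up": -3, "down": 3, "left": -1, "right": 1}


def successors(state):
    """Yield (action, next_state); the action names how the blank moves."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    for action, delta in MOVES.items():
        if action == "up" and row == 0:
            continue
        if action == "down" and row == 2:
            continue
        if action == "left" and col == 0:
            continue
        if action == "right" and col == 2:
            continue
        target = blank + delta
        tiles = list(state)
        tiles[blank], tiles[target] = tiles[target], tiles[blank]
        yield action, tuple(tiles)


def plan(start, goal=GOAL):
    """Return a shortest list of actions from start to goal, or None."""
    parent = {start: None}  # state -> (previous state, action taken)
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            actions = []
            while parent[state] is not None:
                state, action = parent[state]
                actions.append(action)
            return actions[::-1]
        for action, next_state in successors(state):
            if next_state not in parent:
                parent[next_state] = (state, action)
                queue.append(next_state)
    return None


start = (1, 2, 3, 4, 5, 6, 0, 7, 8)
print(plan(start))
```

The solution is literally a sequence of actions, and the set of states is never listed in advance; states are generated only as the search reaches them.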

Given the broad range of problems to which the term planning has been ap- 
plied in the artificial intelligence, control theory, and robotics communities, one 
might wonder whether it has a specific meaning. Otherwise, just about anything 
could be considered as an instance of planning. Some common elements for plan- 
ning problems will be discussed shortly, but first we consider planning as a branch 
of algorithms. Hence, this book is entitled Planning Algorithms. The primary 
focus is on algorithmic and computational issues of planning problems that have 
arisen in several disciplines. On the other hand, this does not mean that planning 
algorithms refers to an existing community of researchers within the general
algorithms community. This book will not be limited to combinatorics and
asymptotic complexity analysis, which are the main focus in pure algorithms.
The focus here
includes numerous modeling considerations and concepts that are not necessarily 
algorithmic, but aid in solving or analyzing planning problems. 

The obvious goal of virtually any planning algorithm is to produce a plan. 
Natural questions are: What is a plan? How is a plan represented? What is it 
supposed to achieve? How will its quality be evaluated? Who or what is going to 
use it? Regarding the user of the plan, it obviously depends on the application. 
In most applications, an algorithm will execute the plan; however, sometimes the
user may be a human. Imagine, for example, that the planning algorithm provides 
you with an investment strategy. A generic term that will be used frequently here
to refer to the user is decision maker. In robotics, the decision maker is simply 
referred to as a robot. In artificial intelligence and related areas, it has become 
popular in recent years to use the term agent, possibly with adjectives to make 
intelligent agent or software agent. Control theory usually refers to the decision 
maker as a system or plant. The plan in this context is sometimes referred to as 
a policy or control law. In a game-theoretic context, it might make sense to refer 
to decision makers as players. Regardless of the terminology used in a particular 
discipline, this book is concerned with planning algorithms that find a strategy for 
one or more decision makers. Therefore, it is important to remember that terms 
like "robot", "agent", and "system" are interchangeable. 

1.2 Illustrative Problems 

This section only has a couple of pasted examples. It still needs to be written, to 
include other examples from discrete planning, information spaces, game theory, 
etc. More examples will be added gradually as other parts of the book are written. 

Suppose that we have a tiny mobile robot that can move along the floor in a 
building. The task is to determine a path that it should follow from a starting 
location to a goal location, while avoiding collisions. A reasonable model can be 
formulated by assuming that the robot is a moving point in a two-dimensional 
environment that contains obstacles. 

Let $W = \mathbb{R}^2$ denote a two-dimensional world that contains a point robot,
denoted by $A$. A subset, $O$, of the world is called the obstacle region. Let the
remaining portion of the world, $W \setminus O$, be referred to as the free space. The
task is to design an algorithm that accepts an obstacle region defined by a set 
of polygons, an initial position, and a goal position. The algorithm must return 
a path that will bring the robot from the initial position to the goal position, 
while only traversing the free space. Algorithms that find exact solutions to this 
problem are given in Section 6.2. 
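The following sketch shows one ingredient that any such algorithm needs: a test of whether a point lies in the free space, together with an approximate check of whether a given polyline stays in the free space. It is not one of the exact methods mentioned above. The ray-casting test, the sampling of each segment, and all names are assumptions made for illustration, and points exactly on an obstacle boundary are not treated carefully.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: is the point strictly inside a simple polygon?"""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal ray going right from the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def is_free(point, obstacles):
    """A point is in the free space if it lies in no obstacle polygon."""
    return not any(point_in_polygon(point, poly) for poly in obstacles)


def path_is_free(waypoints, obstacles, samples_per_segment=50):
    """Approximate check: sample each segment and test every sample."""
    for (ax, ay), (bx, by) in zip(waypoints, waypoints[1:]):
        for k in range(samples_per_segment + 1):
            t = k / samples_per_segment
            sample = (ax + t * (bx - ax), ay + t * (by - ay))
            if not is_free(sample, obstacles):
                return False
    return True


obstacles = [[(2, 0), (4, 0), (4, 3), (2, 3)]]  # one rectangular obstacle
straight = [(0, 1), (6, 1)]                     # cuts through the obstacle
detour = [(0, 1), (1, 4), (5, 4), (6, 1)]       # goes over the top
print(path_is_free(straight, obstacles), path_is_free(detour, obstacles))
```

Because the segments are only sampled, a thin obstacle could slip between samples; an exact method would intersect segments with polygon edges instead.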

Figures 1.2 and 1.3 show considerably more challenging problems. 

1.3 Basic Ingredients of Planning 

Although the subject of this book spans a broad class of models and problems, 
there are several basic ingredients that arise throughout virtually all of the topics 
covered as part of planning. 

Figure 1.1: A simple illustration of the two-dimensional path planning problem:
a) The obstacles, initial position, and goal position are specified as input; b) A
path planning algorithm will compute a collision-free path from the initial position
to the goal position.

State: Planning problems will involve a state space that captures all possible
situations that could exist. The state could, for example, represent the
configuration of a robot, the locations of tiles in a puzzle, or the position
and velocity of a helicopter. Both discrete (finite, or countably infinite) 
and continuous (uncountably infinite) state spaces will be allowed. One 
recurring theme through most of planning is that the state space will usually 
be represented implicitly by a planning algorithm. In most applications, 
the size of the state space (in terms of number of states or combinatorial 
complexity) is much too large to be explicitly represented. Nevertheless, the 
definition of the state space is an important component in the formulation 
of a planning problem, and in the design and analysis of algorithms that 
solve it. 

Time: All planning problems involve a sequence of decisions that must be 
applied over time. Time might be explicitly modeled, as in a problem such as 
driving a car as quickly as possible through an obstacle course. Alternatively, 
time may be implicit, by simply reflecting the fact that actions must follow 
in succession, as in the case of solving the Rubik's cube. The particular 
time is unimportant, but the proper sequence must be maintained. Another 
example is a solution to the Piano Mover's Problem; the solution to moving 
the piano may be converted into an animation over time, but the particular 
speed of motions is not specified in the planning problem. Just as in the 
case of state, time may be either discrete or continuous. In the latter case,
we can imagine that a continuum of decisions is being made by a plan.

Figure 1.2: Remember puzzles like this? Imagine trying to solve one with an
algorithm. The goal is to pull the two bars apart. This example is called the
Alpha 1.0 Puzzle. It was created by Boris Yamrom, GE Corporate Research &
Development Center, and posted as a research benchmark by Nancy Amato at
Texas A&M University. This animation was made by James Kuffner, of Carnegie
Mellon University. The solution was computed by the balanced, bidirectional
RRT algorithm, developed by James Kuffner and Steve LaValle, and covered in
Section 5.5.

Figure 1.3: Using robots to move a piano [177]. This solution was computed using
planning techniques developed by Juan Cortes, Thierry Simeon, and Jean-Paul
Laumond, which are covered in Section 7.4.

Actions: A plan generates actions that manipulate the state. The terms 
actions and operators are common in artificial intelligence; in control theory 
and robotics, the equivalent terms are inputs or controls. Somewhere in 
the planning formulation, it must be specified how the state changes when 
actions are applied. This may be expressed as a state-valued function
for the case of discrete time, or as an ordinary differential equation for 
continuous time. For most motion planning problems, explicit reference to 
time is avoided by designing paths through a continuous state space. Such 
paths may be expressed as the integral of differential equations, but it is 
an unnecessary complication in this case. For some problems, uncontrollable
actions could be chosen by nature, which interfere with the outcome, but are 
not specified as part of the plan. This enables various forms of uncertainty 
to be introduced into the planning problem. 

Initial and goal states: Planning generally involves starting in some initial 
state and trying to arrive at a specified goal state. The actions are selected 
in a way that causes this to happen. 

A criterion: This encodes the desired outcome in terms of state and ac- 
tions that are executed. There are generally two different kinds of planning 
concerns based on the type of criterion: 

1. Feasibility: In this case, the only concern is whether the plan results 
in arriving at a goal state. 

2. Optimality: Find feasible plans that optimize performance in some 
carefully specified manner, in addition to arriving in a goal state. 

For most of the problems considered in this book, feasibility is already chal- 
lenging enough; achieving optimality is considerably harder for most prob- 
lems. Therefore, a substantial amount of focus is on finding feasible solutions 
to problems, as opposed to optimal solutions. The majority of literature in 
robotics, control theory, and related fields focuses on optimality, but this 
is not necessarily important for many problems of interest. In many ap- 
plications, it is difficult to even formulate the right criterion to optimize. 
Even if a desirable criterion can be formulated, it may be impossible to 
obtain a practical algorithm that computes optimal plans. In such cases, 
feasible solutions are certainly preferable to having no solutions at all. For- 
tunately, for many algorithms, such as those developed in motion planning, 
the solutions produced are usually not too far from optimal in practice. 
This reduces the amount of motivation for finding optimal solutions. For 
problems that involve probabilistic uncertainty, however, optimization arises 
more frequently. The probabilities are often utilized to obtain the best
performance in terms of expected costs. Feasibility is usually associated with
performing worst-case analysis of uncertainties.

S. M. LaValle: Planning Algorithms

A plan: In general, a plan will impose a specific strategy or behavior on 
decision makers. A plan might simply specify a sequence of actions to be 
taken; however, it may be more complicated. If it is impossible to predict 
future states, the plan may provide actions as a function of state. In this 
case, regardless of future states, the appropriate action is determined. Using 
terminology from other fields, this enables feedback or reactive plans. It 
might even be the case that the state cannot be measured. In this case, the 
action must be chosen based on whatever information is available up to the 
current time. This will generally be referred to as an information state, on 
which a plan will be conditioned. 
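The ingredients above can be sketched in a toy setting. Assume, purely for illustration, that the state is an integer position on a line, that time is discrete, that actions are -1, 0, or +1, and that nature may add its own disturbance at each stage. The sketch contrasts a plan that is a fixed sequence of actions with a plan that gives the action as a function of state. The function names and the particular disturbance sequence are hypothetical.

```python
def transition(state, action, nature=0):
    """State-valued function: next state from state, action, and nature."""
    return state + action + nature


def run_sequence_plan(start, actions, disturbances):
    """Apply a fixed sequence of actions, ignoring what actually happens."""
    state = start
    for action, nature in zip(actions, disturbances):
        state = transition(state, action, nature)
    return state


def feedback_plan(state, goal):
    """A plan that provides the action as a function of the current state."""
    if state < goal:
        return 1
    if state > goal:
        return -1
    return 0


def run_feedback_plan(start, goal, disturbances):
    """Choose each action from the state that is actually reached."""
    state = start
    for nature in disturbances:
        state = transition(state, feedback_plan(state, goal), nature)
    return state


start, goal = 0, 3
actions = [1, 1, 1, 0, 0, 0]
calm = [0, 0, 0, 0, 0, 0]
pushed = [0, -1, 0, 0, 0, 0]  # nature pushes back once, at the second stage

print(run_sequence_plan(start, actions, calm))    # feasible without nature
print(run_sequence_plan(start, actions, pushed))  # misses the goal
print(run_feedback_plan(start, goal, pushed))     # recovers and reaches it
```

Here the criterion is feasibility only: the question is whether the final state equals the goal. The fixed sequence is feasible when future states are predictable, while the state-dependent plan remains feasible for this disturbance because it reacts to the state that actually occurs.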
Figure 1.4: According to the Church-Turing thesis, the notion of an algorithm is
equivalent to the notion of a Turing machine.



What is a planning algorithm? This is a difficult question to answer completely
in this section without formally introducing the planning concepts that appear
in later chapters. One point needs to be made clear now: the classical Turing
machine model used to define an algorithm in theoretical computer science is
insufficient to encompass planning algorithms. A Turing machine is a finite
state machine with a special head that can read and write along an infinite
piece of tape, as depicted in Figure 1.4. The Church-Turing thesis states that
an algorithm is a Turing machine (see [339, 711] for more details). The input to
the algorithm is encoded as a string of symbols, usually a binary string, and
then is written to the tape. The Turing machine reads the string, performs
computations, and then decides whether to accept or reject the string. This
version of the Turing machine only solves decision problems; however, there are
straightforward extensions that can yield other desired outputs, such as a plan.
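For readers who have not seen one, a Turing machine that solves a decision problem can be simulated in a few lines. The sketch below is an illustration under several assumptions: the tape is stored in a dictionary so that it can grow as needed, a step limit guards against machines that never halt, and the example machine (which accepts binary strings containing an even number of 1s) is chosen only because it is small.

```python
BLANK = "_"


def run_turing_machine(transitions, start, accept, reject, tape_input,
                       max_steps=10_000):
    """Simulate a one-tape machine; return True on accept, False on reject.

    transitions maps (state, symbol) to (next state, symbol to write, move),
    where move is "L" or "R".
    """
    tape = dict(enumerate(tape_input))  # unwritten cells read as BLANK
    state, head = start, 0
    for _ in range(max_steps):
        if state == accept:
            return True
        if state == reject:
            return False
        symbol = tape.get(head, BLANK)
        state, write, move = transitions[(state, symbol)]
        tape[head] = write
        head += 1 if move == "R" else -1
    raise RuntimeError("step limit reached without halting")


# Example machine: accept binary strings with an even number of 1s.
EVEN_ONES = {
    ("even", "0"): ("even", "0", "R"),
    ("even", "1"): ("odd", "1", "R"),
    ("even", BLANK): ("accept", BLANK, "R"),
    ("odd", "0"): ("odd", "0", "R"),
    ("odd", "1"): ("even", "1", "R"),
    ("odd", BLANK): ("reject", BLANK, "R"),
}

for word in ["1001", "1011", ""]:
    print(repr(word), run_turing_machine(EVEN_ONES, "even", "accept",
                                         "reject", word))
```

Everything the machine will ever know is written on the tape before it starts, which is the feature of this model that matters for the discussion that follows.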

The trouble with using the Turing machine as a model for planning algorithms
is that plans will be generally assumed to interact with a physical world, as
depicted in Figure 1.5. This is fundamental to robotics and many other fields in
which planning is used. Using the Turing machine as a foundation for algorithms
usually implies that the physical world must be first carefully modeled and
written on the tape before the algorithm can make decisions. If changes occur in
the world during execution of the algorithm, then it is not clear what should
happen. For example, a mobile robot could be moving in a cluttered environment
in which people are walking around. The robot might throw an object onto a table
without being able to precisely predict how the object will come to rest. It can
take measurements of the results with sensors, but it again becomes a difficult
task to determine how much should be explicitly modeled and written on the tape.
The on-line algorithm model is more appropriate for these kinds of problems [];
however, it still does not capture a notion of planning algorithms that is
sufficiently broad for the topics of this book.

Figure 1.5: a) The boundary between machine and environment is considered as
an arbitrary line that may be drawn in many ways depending on the context; b)
Once the boundary has been drawn, it is assumed that the machine interacts with
the environment through sensing and actuation.
Figure 1.6: A robot could be used to simulate a Turing machine. Through
manipulation, many other kinds of behavior could be obtained that fall outside
of the Turing model.



The processes that can occur in a physical world are more complicated than
the interaction between a state machine and a piece of tape filled with symbols.
It is even possible to simulate the tape by imagining a robot that interacts with
a long row of switches as depicted in Figure 1.6. The switches serve the same
purpose as the tape, and the robot carries a computer that can simulate the finite
state machine (see footnote 1). The complicated interaction allowed between a
robot and its environment could give rise to many other models of computation. A
discussion of performing computations with mechanical systems is given in [?].

Figure 1.7: A classical model that has been used for decades in robotics.

In general, the physical world will be referred to as the environment. The de- 
vice that implements a plan will be referred to as the machine. Practical examples 
of the machine include a robot, a piece of software, or even specialized hardware 
which may be digital or analog. As indicated in Figure 1.5, the boundary between 
the machine and the environment is an arbitrary line that varies from problem to 
problem. Once drawn, sensors provide information about the environment which 
serves as input to the machine during execution. The machine then executes
actions, which provide actuation to the environment. The actuation may alter the
environment in some way that is later measured by sensors. Therefore, there is
close coupling between the machine and its environment during execution. 

It is even possible to draw the line between machine and environment in multi- 
ple places, which results in a hierarchical approach. The environment with respect 
to a machine, $M_1$, might actually include another machine, $M_2$, that interacts with
its environment, as depicted in Figure ??. Figure ?? shows a typical hierarchy 
used for years in robotics. In general, any number of planning layers may be de- 
fined. For the design of planning algorithms, reference will usually only be made 
to a single layer. If the models are formulated correctly for each layer, and if each 
designed plan functions correctly, then the global hierarchy should solve tasks as 
desired. There are many interesting issues involving the construction of such hier- 
archies, but these will not be addressed in this book because they depend heavily 
on the particular context in which planning is used. Determining the appropriate 
places to draw boundaries and modularize a complicated problem is mostly the 
burden of the expert who applies planning techniques in a particular context. 

Once the boundary has been drawn between the machine and its environment, 
a third component can be introduced: the planner. The task of the planner is 
to produce a plan based on a description of possible environments. There are 
two general models for plans constructed by the planner. The first case is de- 
picted in Figure 1.8, in which the planner produces a plan, which is encoded in 
some way and given as input to the machine. In this case, the machine is
considered programmable, and can accept possible plans from the planner before
execution. It will generally be assumed that once the plan is given, the machine
becomes autonomous and can no longer interact with the planner (see footnote 2).

Footnote 1: Of course, simulating infinitely long tape seems impossible in the
physical world. Other versions of Turing machines exist in which the tape is
finite, but unbounded. This may be more appropriate for the current discussion.

Figure 1.8: A planner produces a plan that may be executed by the machine. The
planner may either be a machine itself or even a human.

Figure 1.9: Alternatively, the planner may design the entire machine.

The second general model for plans is depicted in Figure 1.9. In this case, 
the plan produced by the planner encodes an entire machine. The plan can be 
considered as a special-purpose machine that is designed to solve the specific tasks 
given originally to the planner. Under this interpretation, it may be preferable to 
be minimalist and design the simplest machine possible that sufficiently solves
the desired tasks. 

There are two possible ways to implement the planner. The planner may 
either be an algorithm in the Turing machine sense (or some related variant), or 
the planner may even be a human. For example, it is perfectly acceptable for a 
human to design a state machine that is connected to the environment. There 
are no additional inputs in this case because the human fulfilled the role of the 
traditional algorithm. The environment model is given as input to the human, and 
the human "computes" a plan. An example in which the planner is a traditional 
algorithm is given in robotics by classical motion planning. An algorithm receives 
a description of the environment in terms of geometric models and then computes
a plan, which is a collision-free path to be followed by the robot. Whether the 
planner is a human or is a machine itself, the process of developing plans will be 
generally referred to as planning algorithms. 

To summarize, there are three general components: 

1. The environment, which models the physical world with which a plan must 
interact. 

2. The machine, which interacts with the environment through sensing and 
actuation. The machine may be programmable, which means a plan can be 
downloaded, or the machine may simply be the plan itself. 

3. The planner, which takes one of a set of environment descriptions and pro- 
duces a plan. In some cases, the human designs the planner, and in others, 
the human is the planner. In both cases, these will be referred to as planning 
algorithms. 

1.5 Organization of the Book 

PART I: Introductory Material 

This provides very basic background for the rest of the book. 

• Chapter 1: Introductory Material 

This includes some examples and provides a high-level overview of planning 
philosophy. 

Footnote 2: Of course, this model can be extended to allow machines to be
improved over time by receiving better plans; however, we want a strict notion
of autonomy for the discussion of planning in this book. This model does not
prohibit the updating of plans in practice.
• Chapter 2: Discrete Planning

This chapter can be considered as a springboard for entering into the rest 
of the book. From here, you can continue to Part II, or even head straight 
to Part III. Sections 2.2 and 2.3 are most important for heading into Part 
II. For Part III, Section 2.4 is additionally useful. 

PART II: Motion Planning 

The main source of inspiration for the problems and algorithms covered in this 
part comes from robotics. The methods, however, are general enough to apply to 
applications in other areas, such as computational biology, computer-aided design, 
and computer graphics. An alternative title that more appropriately reflects the 
kind of planning that occurs is "Planning in Continuous State Spaces." 

• Chapter 3: Geometric Representations and Transformations 

The chapter gives important background for expressing a motion planning 
problem. Section 3.1 describes how to construct geometric models, and the 
remaining sections indicate how to transform them. Sections 3.1 and 3.2 are 
most important for later chapters. 

• Chapter 4: The Configuration Space 

This chapter introduces concepts from topology and uses them to formulate 
the configuration space, which is the state space that arises in motion plan- 
ning. Sections 4.1, 4.2, and 4.3.1 are most critical for understanding most 
of the material in later chapters. In addition to the previously mentioned 
sections, all of Section 4.3 provides useful background for the combinatorial 
methods of Chapter 6. 

• Chapter 5: Sampling-Based Motion Planning 

This chapter introduces motion planning algorithms that have dominated 
the literature in recent years and have been used in many applications
both in and out of robotics. If you understand the basic idea that the 
configuration space represents a continuous state space, most of the concepts 
should be understandable. They even apply to other problems in which 
continuous state spaces emerge, in addition to motion planning and robotics. 

• Chapter 6: Combinatorial Motion Planning 

The algorithms covered in this chapter are sometimes called exact algorithms.
They provide complete solutions to motion planning problems (i.e., they find a
solution if one exists, or report failure otherwise). The sampling-based
algorithms have been more useful in practice, but these are not complete in
the same sense.

• Chapter 7: Extensions of Basic Motion Planning 

This chapter introduces many problems and algorithms that are extensions
of the methods from Chapters 5 and 6. Most can be followed with a basic
understanding of the material from these chapters. Section 7.4 covers planning
for closed kinematic chains; this requires an understanding of the additional
material covered in Section 4.4.

• Chapter 8: Feedback Motion Strategies 

This is a transitional chapter that introduces feedback into the motion
planning problem, but still does not introduce differential constraints, which are
deferred until Part IV. The previous chapters of Part II focused on computing
open-loop plans, which means that any errors that might occur during
execution of the plan are ignored. The plan will be executed as planned.

PART III: Decision-Theoretic Planning 

An alternative title is "Planning under Uncertainty". Most of the part addresses 
discrete state spaces, which can be studied immediately following Part I. However, 
some sections cover extensions to continuous spaces; to understand these parts, it 
will be helpful to have read some of Part II. 

• Chapter 9: Basic Decision Theory 

The concepts developed here involve making decisions in a
single step, but in the face of uncertainty. Therefore, the problems generally 
are not considered planning, and there is no talk of state spaces. This chap- 
ter provides important background for Part III, however, because planning 
under uncertainty can be considered as multi-step decision making. Chapter 
9 covers a single step, which is used as a building block for later planning 
concepts. 

• Chapter 10: Sequential Decision Theory 

This chapter takes the concepts from Chapter 9 and extends them by chain- 
ing together a sequence of basic decision-making problems. Dynamic pro- 
gramming concepts from Section 2.4 become important here. For all of 
the problems in this chapter, it is assumed that the current state is always 
known. All uncertainties that exist are with respect to prediction of future 
states, as opposed to measuring the current state. 

• Chapter 11: The Information Space 

The chapter defines a framework for planning when the current state is not 
known. Information regarding the state is obtained from sensor observa- 
tions and memory of actions that were previously applied. The information 
space serves a similar purpose for problems with sensing uncertainty as the 
configuration space did for motion planning. 

• Chapter 12: Planning in the Information Space 

This chapter covers several planning problems and algorithms that involve 
sensing uncertainty. 
PART IV: Planning under Differential Constraints

This can be considered as a continuation of Part II. Now there can be both global 
(obstacles) and local (differential) constraints on the continuous state spaces that
arise in motion planning. Dynamical systems are also considered, which yields 
state spaces that include both position and velocity information (this coincides 
with the notion of a state space in control theory or a phase space in physics and 
differential equations). 

• Chapter 13: Differential Models 

This chapter serves as an introduction to Part IV by giving examples of 
differential constraints that arise in practice and explaining how to model 
them in the context of planning. 

• Chapter 14: Nonholonomic System Theory 

This chapter provides an overview of important theory developed for the
control of nonlinear systems. The basic characteristic is that the dimension of
the action space is less than that of the state space, which locally constrains
the possible motions. This can sometimes be overcome by constructing the
Control Lie Algebra (CLA) of the system.

• Chapter 15: Planning Under Differential Constraints 

This covers both sampling-based and exact methods for planning under 
differential constraints. If obstacles are involved, sampling-based methods 
are usually required because the problems are so difficult. Nevertheless, 
many useful and important methods exist for planning under differential
constraints alone.
Chapter 2
Discrete Planning 



Chapter Status 



What does this mean? Check 

http://msl.cs.uiuc.edu/planning/status.html

for information on the latest version. 



2.1 Introduction 

This chapter provides introductory concepts that serve as an entry point into 
other parts of the book. The planning problems considered here are the simplest 
to describe because the state space will be finite in most cases. When it is not 
finite, it will at least be countably infinite (i.e., a unique integer may be assigned 
to every state). Therefore, no geometric models or differential equations will be 
needed to characterize the discrete planning problems. Furthermore, no forms 
of uncertainty will be considered, which avoids complications such as probability 
theory. All models are completely known and predictable. 

There are three main parts to this chapter. Sections 2.2 and 2.3 define and 
present search methods for feasible planning, in which the only concern is to reach 
a goal state. The search methods will be used throughout the book in numerous 
other contexts, including motion planning in continuous state spaces. Follow- 
ing feasible planning, Section 2.4 addresses the more general problem of optimal 
planning. The principle of optimality or dynamic programming (DP) principle [63]
provides a key insight that greatly reduces the computational effort in many
planning algorithms. Therefore, it forms the basis of the algorithms in Section 2.4
and throughout this book. The relationship between Dijkstra's algorithm, which 
is widely known, and more general dynamic programming iterations is discussed. 
Finally, Section 2.5 briefly overviews logic-based representations of planning and 
methods that exploit these representations to construct plans. 
Although this chapter addresses a form of planning, it may also be sometimes
referred to as problem solving. Throughout the history of artificial intelligence 
research, the distinction between problem solving and planning has been rather 
elusive. For example, in a current leading textbook [665], two of the eight major 
parts are termed "Problem-solving" and "Planning". The problem solving part 
begins by stating, "Problem solving agents decide what to do by finding sequences 
of actions that lead to desirable states." ([665], p. 59). The planning part begins 
with, "The task of coming up with a sequence of actions that will achieve a 
goal is called planning." ([665], p. 375). The STRIPS system is considered one 
of the first planning algorithms and representations [247], and its name means 
STanford Research Institute Problem Solver. Perhaps the term "planning" carries 
connotations of future time, whereas "problem solving" sounds somewhat more
general. A problem solving task might be to take evidence from a crime scene 
and piece together the actions taken by suspects. It might seem odd to call this 
a "plan" because it occurred in the past. 

Given that there are no clear distinctions between problem solving and plan- 
ning, we will simply refer to both as planning. This also helps to keep with the 
theme of the book. Note, however, that some of the concepts apply to a broader 
set of problems than what is often meant by planning.



2.2 Definition of Discrete Feasible Planning 

The discrete feasible planning model will be defined using state space models, 
which will appear repeatedly throughout this book. Most of these will be natural 
extensions of the model presented in this section. The basic idea is that each 
distinct situation for the world is called a state, denoted by x, and the set of all 
possible states is called a state space, X. For discrete planning, it will be important 
that this set is countable; in most cases it will be finite. In a given application, 
the state space should be defined carefully so that irrelevant information is not 
encoded into a state (e.g., a planning problem that involves moving a robot in 
France should not encode information about whether or not certain light bulbs are 
on in China). The inclusion of irrelevant information can easily convert a problem 
that is amenable to efficient algorithmic solutions into one that is intractable. 

Refer to the model from Chapter 1: The planner is an algorithm that computes 
a sequence of actions. There is no feedback from the environment. The actions 
are sequenced by the machine. 

The world may be transformed through the application of actions that are 
chosen by the planner. Each action, u, when applied from the current state, x, 
produces a new state, x', as specified by a state transition function, f. Let U(x)
denote the action space for each state x, which represents the set of all actions that
could be applied from x. For distinct x, x' ∈ X, U(x) and U(x') are not necessarily
disjoint; the same action may be applicable in multiple states. Therefore, it will
be convenient to define U as the set of all possible actions over all states:

    U = \bigcup_{x \in X} U(x).    (2.1)

As part of the planning problem, a set, X_G ⊂ X, of goal states is defined. The
task of a planning algorithm is to determine whether a finite sequence of actions,
when applied, transforms the world from an initial state x_I to some state in X_G.
The model is summarized below:

Formulation 2.2.1 (Discrete Feasible Planning) 

1. A nonempty state space, X, which is a finite or countably infinite set of
states.

2. For each state, x ∈ X, a finite action space, U(x).

3. A state transition function, f, which produces a state, f(x, u) ∈ X, for every
x ∈ X and u ∈ U(x).

4. An initial state, x_I ∈ X.

5. A goal set, X_G ⊂ X.

It is often convenient to view Formulation 2.2.1 as a directed graph G(V,E),
in which V and E denote the sets of vertices and edges, respectively. The set
of vertices is the state space, V = X.¹ Let e(x,x') denote a directed edge from
x ∈ X to x' ∈ X. Such an edge exists in E if and only if there exists some
u ∈ U(x) such that x' = f(x, u).

Example 2.2.1 (Moving on a 2D Grid) Suppose that a robot moves around 
on a grid in which each grid point has coordinates of the form (i, j), in which i and
j are both integers. The robot takes discrete steps in one of four directions (e.g., 
up, down, left, right), which can increment or decrement one coordinate. The 
motions and corresponding graph are shown in Figure 2.1, which can be imagined 
as stepping from tile to tile, on an infinite tile floor. 

Let X be the set of all integer pairs of the form (i, j), in which i, j ∈ Z.
Let U = {(0,1), (0,-1), (1,0), (-1,0)}. Let U(x) = U for all x ∈ X. The
state transition equation is f(x,u) = x + u, in which x ∈ X and u ∈ U are
treated as two-dimensional vectors for addition. For example, if x = (3,4) and
u = (0,1), then f(x,u) = (3,5). Suppose for convenience that the initial state is
x_I = (0,0). Many interesting goal sets are possible. Suppose, for example, that
X_G = {(100,100)}. It should be easy for the reader to find a sequence of inputs
that transforms the world from (0,0) to (100,100).

1 Instead, one may want to make a technical distinction between V and X and define a
bijection between them because each contains a different kind of entity.




Figure 2.1: An example problem that involves walking around on an infinite tile
floor. 

The problem can be made more interesting by shading in some of the square 
tiles to represent obstacles that the robot must move around, as shown in Figure 
2.2. In this case, any tile that is shaded has its corresponding vertex and associ- 
ated edges deleted. An outer boundary can be made to fence in a bounded region 
so that X becomes finite. Very complicated labyrinths can be constructed. ■ 
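
Example 2.2.1 can be sketched in a few lines of Python. This is only an illustration: the names ACTIONS, f, and apply_plan are assumptions introduced here and do not appear in the text, and the plan shown is just one of many feasible ones.

```python
# Sketch of Example 2.2.1; all identifiers are illustrative assumptions.
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # up, down, right, left

def f(x, u):
    """State transition function f(x, u) = x + u on integer pairs."""
    return (x[0] + u[0], x[1] + u[1])

def apply_plan(x_init, plan):
    """Apply a finite sequence of actions and return the final state."""
    x = x_init
    for u in plan:
        x = f(x, u)
    return x

# One of many feasible plans: 100 steps right, then 100 steps up.
plan = [(1, 0)] * 100 + [(0, 1)] * 100
final_state = apply_plan((0, 0), plan)
```

Note that the infinite state space is never stored; only the four actions and the rule for f are needed, which is the sense in which the graph is specified implicitly.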



Example 2.2.2 (Rubik's Cube Puzzle) Many puzzles can be expressed as dis- 
crete planning problems. For example, the Rubik's cube is a puzzle that looks 
like a stack of 3 by 3 by 3 little cubes, which together form a larger cube as shown 
in Figure 2.3. Each face of the larger cube is painted one of six colors. An action 
may be applied to the cube by rotating a 3x3 sheet of cubes by 90 degrees. After 
applying many actions to the Rubik's cube, each face will generally be a jumble 
of colors. The state space is the set of configurations for the cube (rotation of the 
entire cube is irrelevant). For each state there are 12 possible actions. For some
arbitrarily chosen configuration of the Rubik's cube, the planning task is to find
a sequence of actions that returns it to the configuration in which each one of its
six faces is a single color. ■

Figure 2.2: Interesting planning problems that involve exploring a labyrinth can
be made by shading in tiles.

Figure 2.3: The Rubik's cube and other puzzles make nice examples of discrete
planning problems.



It is important to note that a planning problem is usually specified without 
explicitly representing the entire graph G. Instead, it is revealed incrementally 
in the planning process. In Example 2.2.1, very little information actually needs 
to be given to specify a graph that is infinite in size. If a planning problem is 
given as input to an algorithm, close attention must be paid to the encoding 
when performing complexity analysis. For a problem in which X is infinite, the 
input length must still be finite. For some interesting classes of problems it may be 
possible to compactly specify a model that is equivalent to Formulation 2.2.1. Such 
representation issues have been the basis of much research in artificial intelligence 
over the past decades as different representation logics have been proposed; see 
Section 2.5. In a sense, these representations can be viewed as input compression
schemes.

Readers experienced in computer engineering might recognize that when X is 
finite, Formulation 2.2.1 appears almost identical to the definition of a finite state
machine, such as a Mealy or Moore machine. Relating the two models, the actions can be
interpreted as inputs to the state machine, and the output of the machine simply 
reports its state. Therefore, the feasible planning problem (if X is finite) may be 
interpreted as determining whether there exists a sequence of inputs that makes 
a finite state machine eventually report a desired output. From a planning per- 
spective, it is assumed that the planning algorithm has a complete representation 
of the machine and is able to read its current state at any time. 

Readers experienced with theoretical computer science may observe similar 
connections to a deterministic finite automaton (DFA), which is a special kind 
of finite state machine that reads an input string, and makes a decision about 
whether to accept or reject the string. The input string is just a finite sequence 
of inputs, in the same sense as for a finite state machine. A DFA definition 
includes a set of accept states, which in the planning context, can be renamed to 
the goal set. This makes the feasible planning problem (if X is finite) equivalent 
to determining whether there exists an input string that is accepted by a given 
DFA. Usually, a language is associated with a DFA, which is the set of all strings it 
accepts. DFAs are important in the theory of computation because their languages
correspond precisely to those described by regular expressions. The planning problem amounts to
determining whether or not the associated language is empty. In terms of Unix-like 
constructions, this means determining whether there is some match to a regular 
expression. 
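
The connection to DFAs can be illustrated with a small sketch that tests whether the language of a DFA is nonempty by searching for a reachable accept state. The function name, the dictionary encoding of the transition function, and the three-state automaton are all hypothetical choices made for this illustration.

```python
from collections import deque

def language_is_nonempty(start, accept_states, delta, alphabet):
    """Return True if some input string leads from start to an accept state.

    delta maps (state, symbol) -> state. This is the same reachability
    question as feasible planning with a finite X: symbols play the role
    of actions and accept states play the role of the goal set.
    """
    visited = {start}
    queue = deque([start])
    while queue:
        q = queue.popleft()
        if q in accept_states:
            return True
        for a in alphabet:
            q_next = delta[(q, a)]
            if q_next not in visited:
                visited.add(q_next)
                queue.append(q_next)
    return False

# Hypothetical three-state DFA over {"a", "b"}; state 2 is accepting.
delta = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 2, (2, "b"): 2,
}
nonempty = language_is_nonempty(0, {2}, delta, ["a", "b"])

# Redirecting the only transition into state 2 makes the language empty.
delta_dead = {**delta, (1, "b"): 0}
empty_case = language_is_nonempty(0, {2}, delta_dead, ["a", "b"])
```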

Thus, there are several ways to represent and interpret the discrete feasible 
planning problem. Other important representation issues will be discussed in 
Section 2.5, which often lead to a very compact, implicit encoding of the problem.
Before reaching these issues, basic planning algorithms are introduced in Section 
2.3, and discrete optimal planning is covered in Section 2.4. 

2.3 Searching for Feasible Plans 

The methods presented in this section are just graph search algorithms, but with 
the understanding that the graph is revealed incrementally through the application 
of actions. The presentation in this section can be considered as graph search 
algorithms from a planning perspective. An important requirement for these or 
any search algorithms is to be systematic. If the graph is finite, this means that 
the algorithm will visit every reachable state, which enables it to correctly declare 
in finite time whether or not a solution exists. To be systematic, the algorithm 
should keep track of states already visited. Otherwise, the search may run forever 
by cycling through the same states. Ensuring that no redundant exploration 
occurs is sufficient to make the search systematic. 

If the graph is infinite, then we are willing to tolerate a weaker definition for
being systematic. If a solution exists, then the search algorithm still must report
it in finite time; however, if a solution does not exist, it is fine for the algorithm
to search forever. This systematic requirement is achieved by ensuring that in the
limit as the number of search iterations tends to infinity, every reachable vertex
in the graph is explored. Since the number of vertices is assumed to be countable,
this must always be possible.

Figure 2.4: a) Many search algorithms focus too much on one direction, which
may prevent them from being systematic on infinite graphs. b) If, for example,
the search carefully expands in wavefronts, then it becomes systematic. The
requirement to be systematic is that in the limit as the number of iterations tends
to infinity, all reachable vertices are reached.

As an example of this requirement, consider Example 2.2.1 on an infinite tile 
floor with no obstacles. If the search algorithm explores in only one direction, as 
depicted in Figure 2.4.a, then in the limit most of the space will be left uncovered,
even though no states are revisited. If instead the search proceeds outward from
the origin in wavefronts, as depicted in Figure 2.4.b, then it may be systematic.
In general, each search algorithm has to be carefully analyzed. A search algorithm
could expand in multiple directions, or even in wavefronts, but still not be
systematic. If the graph is finite, then it is much simpler: virtually any search 
algorithm is systematic, provided that it marks visited states to avoid revisiting 
the same parts indefinitely. 

2.3.1 General Forward Search 

Figure 2.5 gives a general template of search algorithms, expressed using the state 
space representation. At any point during the search, there will be three kinds of 
states: 
FORWARD_SEARCH
 1  Q.Insert(x_I) and mark x_I as visited
 2  while Q not empty do
 3      x ← Q.GetFirst()
 4      if x ∈ X_G
 5          return SUCCESS
 6      forall u ∈ U(x)
 7          x' ← f(x, u)
 8          if x' not visited
 9              Mark x' as visited
10              Q.Insert(x')
11          else
12              Resolve duplicate x'
13  return FAILURE

Figure 2.5: A general template for forward search.

Unvisited: States that have not been visited yet. Initially, this is every
state except x_I.

Dead: States that have been visited, and for which every possible next
state has also been visited. A next state of x is a state x' for which there
exists a u ∈ U(x) such that x' = f(x,u). In a sense, these states are dead
because there is nothing more that they can contribute to the search; there
are no new leads that could help in finding a feasible plan. Section 2.4.3
discusses a variant in which dead states can become alive again in an effort
to obtain optimal plans.

Alive: States that have been encountered and may have next states that
have not been visited. These are considered alive. Initially, the only alive
state is x_I.

The set of alive states is stored in a priority queue, Q, for which a priority
function must be specified. The only significant difference between various search
algorithms is the particular function used to sort Q. Many variations will be
described later, but for the time being, it might be helpful to pick one. Therefore,
assume for now that Q is a common FIFO (First-In First-Out) queue; whichever
state has been waiting the longest will be chosen when Q.GetFirst() is called.
The rest of the general search algorithm is quite simple. Initially, Q contains the
initial state, x_I. A while loop is then executed, which terminates only when Q
is empty. This will only occur when the entire graph has been explored without
finding any goal states, which results in a FAILURE (unless X is infinite, in which
case the algorithm should never terminate). In each while iteration, the highest-
ranked element, x, of Q is removed. If x lies in X_G, then it reports SUCCESS
and terminates. Otherwise, the algorithm tries applying every possible action,
u ∈ U(x). For each next state, x' = f(x,u), it must determine whether x' is
being encountered for the first time. If it is unvisited, then it is inserted into
Q. Otherwise, there is no need to consider it because it must be either dead or
already in Q.
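
The template of Figure 2.5 with a FIFO queue might be sketched in Python as follows. The function signature, the use of a hash set for the visited states, and the fenced 5 by 5 grid with a wall of shaded tiles are assumptions made for illustration only.

```python
from collections import deque

def forward_search(x_init, is_goal, actions, f):
    """Forward search template of Figure 2.5 with a FIFO queue.

    is_goal(x) tests whether x is in X_G, actions(x) enumerates U(x),
    and f(x, u) is the state transition function. States must be hashable.
    """
    queue = deque([x_init])      # alive states
    visited = {x_init}           # every state that is alive or dead
    while queue:
        x = queue.popleft()      # Q.GetFirst()
        if is_goal(x):
            return True          # SUCCESS
        for u in actions(x):
            x_next = f(x, u)
            if x_next not in visited:
                visited.add(x_next)
                queue.append(x_next)
            # else: nothing to resolve when Q is a FIFO queue
    return False                 # FAILURE

# Usage on a hypothetical fenced 5 x 5 grid with one wall of shaded tiles.
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]
OBSTACLES = {(2, 0), (2, 1), (2, 2), (2, 3)}

def step(x, u):
    return (x[0] + u[0], x[1] + u[1])

def grid_actions(x):
    """U(x): moves that stay inside the fence and avoid shaded tiles."""
    for u in MOVES:
        i, j = step(x, u)
        if 0 <= i < 5 and 0 <= j < 5 and (i, j) not in OBSTACLES:
            yield u

found = forward_search((0, 0), lambda x: x == (4, 0), grid_actions, step)
blocked = forward_search((0, 0), lambda x: x == (2, 2), grid_actions, step)
```

Passing is_goal, actions, and f as functions keeps the graph implicit: the search only ever sees the states it generates.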

The algorithm description in Figure 2.5 omits several details that often become 
important in practice. For example, how efficient is the test whether x ∈ X_G in
Line 4? This depends, of course, on the size of the state space and on the particular
representations chosen for x and X_G. At this level, we do not specify a particular
method because the representations are not given. 

One important detail is that the existing algorithm only indicates whether or 
not a solution exists, but does not seem to produce a plan, which is a sequence 
of actions that achieves the goal. This can be fixed by simply adding another line
after Line 7 that associates with x' its parent, x. If this is performed each
time, one can simply trace the pointers from the final state to the initial state to 
recover the entire plan. For convenience, one might also store which action was 
taken, in addition to the pointer. 
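
A minimal sketch of plan recovery with parent pointers is given below. It assumes a FIFO queue, and it records the pointer only when x' is first visited so that a later duplicate cannot overwrite it; that choice, like the identifier names, is an assumption of this sketch rather than something prescribed by the text.

```python
from collections import deque

def forward_search_with_plan(x_init, is_goal, actions, f):
    """Breadth-first forward search that returns a list of actions, or None."""
    parent = {x_init: None}      # maps x' to (x, u); doubles as the visited set
    queue = deque([x_init])
    while queue:
        x = queue.popleft()
        if is_goal(x):
            plan = []
            while parent[x] is not None:   # trace pointers back to x_init
                x, u = parent[x]
                plan.append(u)
            plan.reverse()
            return plan
        for u in actions(x):
            x_next = f(x, u)
            if x_next not in parent:
                parent[x_next] = (x, u)    # recorded only on the first visit
                queue.append(x_next)
    return None

MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def step(x, u):
    return (x[0] + u[0], x[1] + u[1])

# Infinite tile floor of Example 2.2.1 with a nearby goal.
plan = forward_search_with_plan(
    (0, 0), lambda x: x == (2, 1), lambda x: MOVES, step)
```

Storing the action together with the parent avoids having to rediscover which action connects two consecutive states when the plan is read back.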

Lines 8 and 9 are conceptually simple, but how can one tell whether x' has 
been visited? For some problems, the graph G might actually be a tree, which means
that there are no repeated states. Although this does not occur frequently, it is 
wonderful when it does because there is no need to check whether states have 
been visited. If the states in X all lie on a grid, one can simply make a lookup 
table that can be accessed in constant time to determine whether a state has 
been visited. In general, however, it might be quite difficult because the state x' 
must be compared with every other state in Q, and with all of the dead states. 
If the representation of each state is long, as is sometimes the case, this will be 
very costly. A good hashing scheme or another clever data structure can greatly 
alleviate this cost, but in many applications the computation time will remain 
high. One alternative is to simply allow repeated states, but this could lead to an 
increase in computational cost that far outweighs the benefits. Even if the graph 
is very small, search algorithms could run in time exponential in the size of the 
graph, or they may not even terminate at all, even if G is finite. 
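
The two ways of testing whether a state has been visited can be contrasted in a short sketch. The grid dimensions, the helper names, and the 54-entry state encoding are hypothetical; the hash set is Python's built-in set, used here as one example of a hashing scheme.

```python
# Lookup table for a bounded grid: constant-time test by direct indexing.
WIDTH, HEIGHT = 4, 3
visited_table = [[False] * HEIGHT for _ in range(WIDTH)]

def mark_grid(x):
    visited_table[x[0]][x[1]] = True

def seen_grid(x):
    return visited_table[x[0]][x[1]]

# Hash set for general states: a state with a long representation is
# stored as a tuple, so a membership test costs time proportional to the
# length of one state on average, rather than a comparison against every
# stored state.
visited_set = set()

def mark_general(x):
    visited_set.add(tuple(x))

def seen_general(x):
    return tuple(x) in visited_set

mark_grid((1, 2))
long_state = list(range(54))      # hypothetical 54-entry state encoding
mark_general(long_state)
```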

One final detail is that some search algorithms will require a cost to be
computed and associated with every state. If the same state is reached multiple times,
the cost may have to be updated, which is performed in Line 12, if the particular
search algorithm requires it. Such costs may be used in some way to sort the
priority queue, or they may enable the recovery of the plan upon completion of
the algorithm. Instead of storing pointers, as mentioned previously, the optimal
cost to return to the initial state could be stored with each state. This cost alone
is sufficient to determine the action sequence that leads to any visited state.
Starting at x_I, simply choose the action u ∈ U(x) that produces the lowest-cost
next state, and continue the process iteratively until the goal is reached. The costs must
have a certain monotonicity property, which is obtained by Dijkstra's algorithm
and A* search, both of which are introduced in Section 2.3.2.

2.3.2 Particular Forward Search Methods 

This section presents several single-tree search algorithms, each of which is a 
special case of the algorithm in Figure 2.5, obtained by defining a different sorting 
function for Q. Most of these are just classical graph search algorithms. 

Breadth First 

The method given in Section 2.3.1 specifies Q as a FIFO queue, which selects states 
using the first-come, first-serve principle. This causes the search frontier to grow 
uniformly, and is therefore referred to as breadth-first search. All plans that have 
k steps are exhausted before plans with k + 1 steps are investigated. Therefore, 
breadth first guarantees that the first solution found will use the smallest number 
of steps. Upon detection that a state has been revisited, there is no work to do 
in Line 12. Since the search progresses in a series of wavefronts, breadth first 
search is systematic. In fact, it even remains systematic if it does not keep track 
of repeated states (however, it will waste time considering irrelevant cycles). 

The running time of breadth first search is O(|V| + |E|), in which |V| and |E|
are the numbers of vertices and edges, respectively, in the graph representation
of the planning problem. This assumes that all operations, such as determining
whether a state has been visited, are performed in constant time. In practice,
these operations will typically require more time, and must be counted as part
of the algorithm complexity. The running time can be expressed in terms of the other
representations. Recall that |V| = |X| is the number of states. If the same actions,
U, are available from every state, then |E| = |U||X|. If the action sets U(x_1) and
U(x_2) are pairwise disjoint for any distinct x_1, x_2 ∈ X, then |E| = |U|.

Depth First 

By making Q a stack (Last-In, First-Out), aggressive exploration of the graph
occurs, as opposed to the uniform expansion of breadth first search. The resulting
variant is called depth first search because the search dives quickly into the graph.
The preference is toward investigating longer plans very early. Although this
aggressive behavior might seem desirable, note that the particular choice of longer
plans is arbitrary. Actions are applied in the forall loop in whatever order they
happen to be defined. Once again, if a state is revisited, there is no work to do
in Line 12. Depth first search is systematic for finite X, but not for an infinite
X because it could behave like Figure 2.4.a. The search could easily focus on
one "direction" and completely miss large portions of the search space as the
number of iterations tends to infinity. The running time of depth first search is
also O(|V| + |E|).
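
The difference between the two queue disciplines on an infinite graph can be observed with a small experiment on the tile floor of Example 2.2.1. The iteration cap, the goal (0, 3), and the order in which actions are listed are assumptions of this sketch; with a different action order the depth first variant would dive in a different direction.

```python
from collections import deque

MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def search(goal, use_stack, max_iterations):
    """Forward search from (0, 0) on the infinite grid.

    use_stack=False gives breadth first (FIFO); use_stack=True gives
    depth first (LIFO). Returns the number of iterations used to reach
    goal, or None if the cap max_iterations is hit first.
    """
    queue = deque([(0, 0)])
    visited = {(0, 0)}
    for iteration in range(1, max_iterations + 1):
        if not queue:
            return None
        x = queue.pop() if use_stack else queue.popleft()
        if x == goal:
            return iteration
        for u in MOVES:
            x_next = (x[0] + u[0], x[1] + u[1])
            if x_next not in visited:
                visited.add(x_next)
                queue.append(x_next)
    return None

bfs_iterations = search((0, 3), use_stack=False, max_iterations=10_000)
dfs_iterations = search((0, 3), use_stack=True, max_iterations=10_000)
```

With this action order, the stack always yields the most recently inserted left neighbor, so the depth first variant walks along the negative horizontal axis and exhausts the cap, whereas the breadth first variant reaches the goal after expanding only the few states that are closer to the origin.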
Dijkstra's Algorithm

Up to this point, there has been no reason to prefer any action over any other in 
the search. Section 2.4 will formalize optimal discrete planning, and will present 
several algorithms that find optimal plans. Before going into that, we present 
a systematic search algorithm that finds optimal plans because it is also useful 
for finding feasible plans. The result is the well-known Dijkstra's algorithm for 
finding single-source shortest paths in a graph [], which is a special form of dy- 
namic programming. More-general dynamic programming computations appear 
in Section 2.4 and throughout the book. 

Suppose that every edge, e ∈ E, in the graph representation of a discrete
planning problem has an associated nonnegative cost l(e), which is the cost to apply
the action. The cost l(e) could be written using the state space representation as
l(x,u), indicating that it costs l(x,u) to apply action u from state x. The total
cost of a plan is just the sum of the edge costs over the path from the initial state
to a goal state.

The priority queue, Q, will be sorted according to a function, C* : X → [0, ∞],
called the optimal cost-to-come or just cost-to-come if it is clearly optimal from
the context. For each state, x, the value C*(x) will represent the optimal² cost to
reach x from the initial state, x_I. This optimal cost is obtained by summing edge
costs, l(e), over all possible paths from x_I to x, and using the path that produces
the least cumulative cost.

The cost-to-come is computed incrementally during the execution of the search
algorithm in Figure 2.5. Initially, C*(x_I) = 0. Each time the state x' is generated,
a cost is computed as C(x') = C*(x) + l(e), in which e is the edge from x to x'
(equivalently, we may write C(x') = C*(x) + l(x,u)). Here, C(x') represents the best
cost-to-come that is known so far, but we do not write C* because it is not yet
known whether x' was reached optimally. Because of this, some work is required
in Line 12. If x' already exists in Q, then it is possible that the newly discovered
path to x' is more efficient. If so, then the cost-to-come value C(x') must be
lowered for x', and Q must be reordered accordingly.

2 As in the optimization literature, we will use * to mean optimal.

When does C(x) finally become C*(x) for some state x? Once x is removed
from Q using Q.GetFirst(), the state becomes dead, and it is known that x
cannot be reached with lower cost. This can be argued by induction. For the
initial state, C*(x_I) is known, and this serves as the base case. Now assume that
all dead states have their optimal cost-to-come correctly determined. This means
that their cost-to-come values can no longer change. For the first element, x, of Q,
the value must be optimal because any path that has lower total cost would have
to travel through another state in Q, but these states already have cost at least
as high (recall that edge costs are nonnegative). All paths that pass only through
dead states were already considered in producing C(x). Once all edges leaving x
are explored, then x can be declared as dead, and the induction continues. This is
not enough detail to constitute a proof; much more detailed arguments appear in
Section 2.4.3 and [176]. The running time is O(|V| lg |V| + |E|), in which |V| and
|E| are the numbers of vertices and edges, respectively, in the graph representation
of the discrete planning problem. This assumes that the priority queue is
implemented with a Fibonacci heap, and that all other operations, such as
determining whether a state has been visited, are performed in constant time. If
other data structures are used to implement the priority queue, then different
running times will be obtained.
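The cost-to-come bookkeeping described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the graph is given
as a dictionary of (next state, edge cost) lists, the names dijkstra, edges, and
example are hypothetical, and a binary heap with lazy deletion of stale entries
is used in place of reordering Q (so the running time is roughly O(|E| lg |V|)
rather than the Fibonacci-heap bound quoted above).

```python
import heapq

def dijkstra(edges, x_init):
    """Return the optimal cost-to-come C*(x) for every state reachable from x_init.

    edges: dict mapping a state x to a list of (x_next, cost) pairs, cost >= 0.
    """
    cost = {x_init: 0.0}      # best cost-to-come known so far, C(x)
    dead = set()              # states whose C(x) is final, i.e. C(x) = C*(x)
    queue = [(0.0, x_init)]   # priority queue Q, sorted by C(x)
    while queue:
        c, x = heapq.heappop(queue)
        if x in dead:
            continue          # stale entry left behind by an earlier improvement
        dead.add(x)           # x was removed from Q, so C(x) is now optimal
        for x_next, edge_cost in edges.get(x, []):
            c_next = c + edge_cost                  # C(x') = C*(x) + l(e)
            if c_next < cost.get(x_next, float("inf")):
                cost[x_next] = c_next               # a cheaper path to x' was found
                heapq.heappush(queue, (c_next, x_next))
    return cost

# A small hypothetical graph: s -> a -> b -> g is cheaper than the direct edges.
example = {"s": [("a", 2), ("b", 5)], "a": [("b", 1), ("g", 6)], "b": [("g", 2)]}
print(dijkstra(example, "s"))
```

Pushing a new queue entry each time C(x') is lowered plays the role of the
"resolve duplicate" step in Line 12; the dead set discards the outdated entries
when they eventually reach the front of the queue.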

A-Star 

The A* (pronounced "ay star") search algorithm is a variant of dynamic
programming that tries to reduce the total number of states explored by
incorporating a heuristic estimate of the cost to get to the goal from a given
state. Let C(x) denote the cost-to-come from x_I to x, and let G(x) denote the
cost-to-go from x to some state in X_G. Although C*(x) can be computed
incrementally by dynamic programming, there is no way to know the true optimal
cost-to-go, G*, in advance. However, in many applications it is possible to
construct a reasonable underestimate of this cost. As an example of a typical
underestimate, consider planning in the labyrinth depicted in Figure 2.2. Suppose
that the cost is the total number of planning steps. If one state has coordinates
(i, j) and another has (i', j'), then |i' − i| + |j' − j| is an underestimate because
this is the length of a straightforward plan that ignores obstacles. Once obstacles
are included, the cost can only increase as the robot tries to get around them
(which may not even be possible). Of course, zero could also serve as an
underestimate, but that will not provide any helpful information to the algorithm.
The aim is to compute an estimate that is as close as possible to the optimal
cost-to-go, and is also guaranteed to be no greater. Let Ĝ*(x) denote such an
estimate.

The A* search algorithm works in exactly the same way as Dijkstra's algorithm.
The only difference is the function used to sort Q. In the A* algorithm, the sum
C*(x') + Ĝ*(x') is used, implying that the priority queue is sorted by estimates
of the optimal cost from x_I to X_G. If Ĝ*(x) is an underestimate of the true
optimal cost-to-go for all x ∈ X, the A* algorithm is guaranteed to find optimal
plans [247, 622]. As Ĝ* becomes closer to G*, fewer nodes tend to be explored in
comparison with dynamic programming. This would always seem advantageous,
but in some problems it is not possible to find a good heuristic. Note that when
Ĝ*(x) = 0 for all x ∈ X, then A* degenerates to Dijkstra's algorithm. In any
case, the search will always be systematic.
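As a rough sketch of the idea, the following Python function runs A* on a small
grid using the underestimate |i' − i| + |j' − j| from the labyrinth discussion.
The grid encoding ('#' for an obstacle), the 4-connected moves, and the names
a_star, grid, start, and goal are assumptions made for illustration; they are not
taken from Figure 2.2.

```python
import heapq

def a_star(grid, start, goal):
    """Return the number of steps in a shortest 4-connected path, or None.

    grid: list of equal-length strings, '#' marks an obstacle.
    start, goal: (i, j) index pairs of free cells.
    """
    def h(cell):
        # Underestimate of the cost-to-go: path length if obstacles are ignored.
        return abs(goal[0] - cell[0]) + abs(goal[1] - cell[1])

    cost = {start: 0}                  # best known cost-to-come C(x)
    queue = [(h(start), start)]        # sorted by C(x) + h(x)
    dead = set()
    while queue:
        _, x = heapq.heappop(queue)
        if x == goal:
            return cost[x]
        if x in dead:
            continue                   # stale queue entry
        dead.add(x)
        i, j = x
        for nxt in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            ni, nj = nxt
            if not (0 <= ni < len(grid) and 0 <= nj < len(grid[0])):
                continue               # outside the grid
            if grid[ni][nj] == "#":
                continue               # blocked by an obstacle
            c_next = cost[x] + 1       # every planning step costs one unit
            if c_next < cost.get(nxt, float("inf")):
                cost[nxt] = c_next
                heapq.heappush(queue, (c_next + h(nxt), nxt))
    return None                        # queue exhausted: goal unreachable
```

Replacing h with a function that always returns zero turns this sketch into the
Dijkstra-style search, which matches the degenerate case noted above.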

Best First 

For best first search, the priority queue is sorted according to an estimate of
the optimal cost-to-go. The solutions obtained in this way are not necessarily
optimal; therefore, it does not matter whether or not the estimate exceeds the
true optimal cost-to-go, which was important for A*. Although optimal solutions
are not found, in many cases far fewer nodes are explored, which results in much
faster running times. There is no guarantee, however, that this will happen. The
worst-case performance of best first search is worse than that of A* and dynamic
programming. The algorithm is often too greedy because it prefers states that
"look good" very early in the search. Sometimes the price must be paid for being
greedy! Figure 2.6 shows a contrived example in which the planning problem
involves taking small steps in a 3D world. For any specified number, k, of steps,
it is easy to construct a spiral example that wastes at least k steps in comparison
to Dijkstra's algorithm. Note that best first search is not systematic.

Figure 2.6: Here is a bad example for best-first search. Imagine trying to reach a
state that is directly below the spiral tube. If the initial state starts inside of the
opening at the top of the tube, the search will progress around the spiral instead
of leaving the tube and heading straight for the goal. (The figure labels the
initial state and the goal state.)

Iterative Deepening 

The iterative deepening approach is usually preferable when there is a large
branching factor. This could occur if there are many actions per state and few
states are revisited. The idea is to use depth-first search and find all states that
are distance i or less from x_I. If the goal is not found, then the search graph is
discarded, and depth first is applied to find all states of distance i + 1 or less
from x_I. This generally iterates from i = 1 and proceeds indefinitely until the
goal is found. The motivation for discarding the work of previous iterations is
that the number of states reached for i + 1 is expected to far exceed (e.g., by a
factor of ten) the number reached for i. Therefore, once the commitment has been
made to reach level i + 1, all of the previous efforts have low relative cost. The
iterative deepening method has better worst-case performance than breadth-first
search for many problems. If the nearest goal state is i steps from x_I,
breadth-first in the worst case might reach nearly all states of distance i + 1. This
occurs each time a state x ∉ X_G of distance i from x_I is reached because all new
states that can be reached in one step are placed onto Q. The A* idea can be
combined with iterative deepening to yield IDA*, in which i is replaced by
C*(x') + Ĝ*(x'). In each iteration of IDA*, larger and larger values of total cost
are allowed [622].
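The depth-limited restarts can be sketched as follows. The function names, the
successors callback, and the max_depth cap are assumptions introduced for this
illustration (the description above lets i grow indefinitely; the cap only
guarantees that the sketch terminates when no plan exists).

```python
def depth_limited(x, goal, successors, limit, path):
    """Depth-first search that gives up once `limit` actions have been applied."""
    if x == goal:
        return list(path)
    if limit == 0:
        return None
    for x_next in successors(x):
        if x_next in path:
            continue                  # do not cycle along the current branch
        path.append(x_next)
        found = depth_limited(x_next, goal, successors, limit - 1, path)
        path.pop()
        if found is not None:
            return found
    return None

def iterative_deepening(x_init, goal, successors, max_depth=50):
    """Return a list of states from x_init to goal using the fewest actions."""
    for i in range(1, max_depth + 1):     # i = 1, 2, 3, ...
        # The work of the previous iteration is discarded and redone.
        plan = depth_limited(x_init, goal, successors, i, [x_init])
        if plan is not None:
            return plan
    return None

# Hypothetical state space: integers, with actions "add one" and "double".
plan = iterative_deepening(1, 10, lambda n: [n + 1, 2 * n])
print(plan)
```

Because the depth limit grows one step at a time, the first plan returned uses
the smallest possible number of actions, at the price of repeating the shallow
levels in every iteration.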

2.3.3 Other General Search Schemes 

This section covers two other general templates for search algorithms. The first 
one is simply a "backwards" version of the tree search algorithm in Figure 2.5. 
The second one is a bidirectional approach that grows two search trees, one from 
the initial state, and one from a goal state. 

Backwards Search 

Suppose that there is a single goal state, x_G. For many planning problems, it might
be the case that the branching factor is large when starting from x_I. In this case,
it might be more efficient to start the search at a goal state and work backwards
until the initial state is encountered. A general template for this approach is given
in Figure 2.7. In forward search, an action u ∈ U(x) is applied from x ∈ X to
obtain a new state, x' = f(x,u). For backwards search, a frequent computation will
be to determine for some x', what could be the preceding state, x ∈ X, and action,
u ∈ U(x), such that x' = f(x,u)?

For most problems, it may be preferable to precompute a representation of the
state transition equation, f, that is "backwards" to be consistent with the search
algorithm. Some convenient notation will now be constructed for the backwards
version of f. Let U^{-1} = {(x,u) | x ∈ X, u ∈ U(x)}, which represents the set of all
state-action pairs, and can also be considered as the domain of f. Imagine from
a given state x' ∈ X, the set of all (x,u) ∈ U^{-1} that map to x' using f. This can
be considered as a backwards action space, defined formally for any x' ∈ X as:

    U^{-1}(x') = \{ (x,u) \in U^{-1} \mid x' = f(x,u) \}.    (2.2)

For convenience, let u^{-1} denote a state-action pair (x,u) which belongs to some
U^{-1}(x'). From any u^{-1} ∈ U^{-1}(x'), there is a unique x ∈ X. Thus, let f^{-1} denote
a backwards state transition equation that yields x from x' and u^{-1} ∈ U^{-1}(x').
Hence, we can write x = f^{-1}(x', u^{-1}), which looks very similar to the forward
version, x' = f(x,u).
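When X is finite, the backwards action sets can be precomputed by a single pass
over all state-action pairs. The following Python sketch illustrates this; the
function names and the representation of u^{-1} as a tuple (x, u) are assumptions
chosen to mirror the notation above.

```python
from collections import defaultdict

def backward_actions(states, U, f):
    """Build U^{-1}(x') = {(x, u) : x' = f(x, u)} for every x' in one pass."""
    U_inv = defaultdict(list)
    for x in states:
        for u in U(x):
            U_inv[f(x, u)].append((x, u))   # (x, u) is a backwards action at f(x, u)
    return U_inv

def f_inv(x_prime, u_inv):
    """Backwards state transition: the pair u^{-1} = (x, u) determines x uniquely."""
    x, _u = u_inv
    return x

# Hypothetical model: states 0..3, actions "inc" (add one) and "dbl" (double), mod 4.
def U(x):
    return ["inc", "dbl"]

def f(x, u):
    return (x + 1) % 4 if u == "inc" else (2 * x) % 4

U_inv = backward_actions(range(4), U, f)
print(U_inv[0])   # every state-action pair that leads to state 0
```

Storing the originating state inside each backwards action makes f^{-1} a trivial
lookup, at the cost of O(|X||U|) memory for the table.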

The interpretation of f^{-1} is easy to capture in terms of the graph
representation. Imagine reversing the direction of every edge. This will make
finding a plan in the reversed graph using backwards search equivalent to finding
one in the original graph using forward search. The backwards state transition
equation is just the version of f that is obtained after reversing all of the edges.
Each u^{-1} is just a reversed edge. Since there is a perfect symmetry with respect
to the forward search of Section 2.3.1, any of the search algorithm variants from
Section 2.3.2 could be adapted to work under the template in Figure 2.7 once f^{-1}
has been defined.

BACKWARDS_SEARCH
 1  Q.Insert(x_G)
 2  while Q not empty do
 3     x ← Q.GetFirst()
 4     if x = x_I
 5        return SUCCESS
 6     forall u^{-1} ∈ U^{-1}(x)
 7        x' ← f^{-1}(x, u^{-1})
 8        if x' not visited
 9           Mark x' as visited
10           Q.Insert(x')
11        else
12           Resolve duplicate x'
13  return FAILURE

Figure 2.7: A general template for backwards search.

Bidirectional Search

Now that forward and backwards search have been covered, the next reasonable
idea is to conduct a bidirectional search. The general search template given in
Figure 2.8 can be considered as a combination of the two in Figures 2.5 and 2.7.
One tree is grown from the initial state, and the other is grown from the goal state.
The search terminates with success when the two trees meet. Failure occurs if both
priority queues have been exhausted. For many problems, bidirectional search can
dramatically reduce the amount of exploration required to solve the problem. The
dynamic programming and A* variants of bidirectional search will lead to optimal
solutions. For best-first and other variants, it may be challenging to ensure that
the two trees meet quickly. They might come very close to each other, and then
fail to connect. Additional heuristics may help in some settings to guide the
trees into each other. One can even extend this framework to allow any number
of search trees. This may be desirable in some applications, but connecting the
trees becomes even more complicated and expensive.
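To make the bidirectional idea concrete, here is a minimal breadth-first sketch in
Python that alternates between a forward tree and a backward tree and stops as
soon as one tree generates a state already reached by the other. It assumes a
single goal state and that predecessors (the reversed edges) are available; the
function and variable names are hypothetical, and it only reports success or
failure rather than reconstructing a plan.

```python
from collections import deque

def bidirectional_search(x_init, x_goal, successors, predecessors):
    """Return True if the forward and backward trees meet, False otherwise."""
    if x_init == x_goal:
        return True
    seen_fwd, seen_bwd = {x_init}, {x_goal}
    q_fwd, q_bwd = deque([x_init]), deque([x_goal])
    while q_fwd or q_bwd:
        if q_fwd:                               # grow the tree rooted at x_init
            x = q_fwd.popleft()
            for x_next in successors(x):
                if x_next in seen_bwd:
                    return True                 # the two trees have met
                if x_next not in seen_fwd:
                    seen_fwd.add(x_next)
                    q_fwd.append(x_next)
        if q_bwd:                               # grow the tree rooted at x_goal
            x = q_bwd.popleft()
            for x_prev in predecessors(x):
                if x_prev in seen_fwd:
                    return True                 # the two trees have met
                if x_prev not in seen_bwd:
                    seen_bwd.add(x_prev)
                    q_bwd.append(x_prev)
    return False                                # both queues exhausted

# Hypothetical directed graph given by its forward edges.
forward = {1: [2], 2: [3], 3: [4], 5: [4]}

def successors(x):
    return forward.get(x, [])

def predecessors(x):
    return [a for a, targets in forward.items() if x in targets]

print(bidirectional_search(1, 4, successors, predecessors))
```

Checking membership in the opposite visited set when a state is generated is one
simple way to detect that the trees have met; as noted above, this test becomes
more involved once costs or more than two trees are considered.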

2.3.4 A Unified View of the Search Methods 

It is convenient to summarize the behavior of all search methods in terms of
several basic steps. Variations of these steps will appear later for more complicated
planning problems. For example, in Section 5.4, a large family of sampling-based
motion planning algorithms can be viewed as an extension of the steps presented
here. The extension in this case is made from a discrete state space to a continuous
state space (the configuration space).

All of the planning methods from this section followed the same basic template: 

1. Initialization: Let the search graph, G(V,E), be initialized with E empty
and V containing x_I and possibly some other states. If bidirectional search
is used, then initially, V = {x_I, x_G}. It is possible to grow more than two
trees and merge them during the search process. In this case, more states
can be initialized in V.

2. Select Node: Choose a node n_cur ∈ V for expansion. Let x_cur denote its
associated state.

3. Apply an Action: In either a forward or backwards direction, a new state,
x_new, is obtained. This may arise from x_new = f(x,u) for some u ∈ U(x)
(forward) or x = f(x_new,u) for some u ∈ U(x_new) (backwards).

4. Insert a Directed Edge in the Graph: If certain algorithm-specific
tests are passed, then generate an edge from x to x_new for the forward case,
or an edge from x_new to x for the backwards case. If x_new is not yet in V, it
will be inserted into V (see footnote 3).

5. Check for Solution: Determine whether G encodes a path from x_I to x_G.
If there is a single search tree, then this is trivial. If there are two or more
search trees, then this step can become expensive.

6. Return to Step 2: Iterate unless a solution has been found or some
termination condition is satisfied, in which case the algorithm reports failure.

Note that in this summary, several iterations may have to be made to generate
one iteration in the previous formulations. The forward search algorithm in Figure
2.5 tries all actions for the first element of Q. If there are k actions, this
corresponds to k iterations in the algorithm above.

BIDIRECTIONAL_SEARCH
 1  Q_I.Insert(x_I)
 2  Q_G.Insert(x_G)
 3  while Q_I not empty or Q_G not empty do
 4     if Q_I not empty
 5        x ← Q_I.GetFirst()
 6        if x ∈ X_G or x ∈ Q_G
 7           return SUCCESS
 8        forall u ∈ U(x)
 9           x' ← f(x, u)
10           if x' not visited
11              Mark x' as visited
12              Q_I.Insert(x')
13           else
14              Resolve duplicate x'
15     if Q_G not empty
16        x' ← Q_G.GetFirst()
17        if x' = x_I or x' ∈ Q_I
18           return SUCCESS
19        forall u^{-1} ∈ U^{-1}(x')
20           x ← f^{-1}(x', u^{-1})
21           if x not visited
22              Mark x as visited
23              Q_G.Insert(x)
24           else
25              Resolve duplicate x
26  return FAILURE

Figure 2.8: A general template for bidirectional search.

2.4 Discrete Optimal Planning 

This section extends Formulation 2.2.1 to allow optimal planning problems to
be defined. Rather than being satisfied with any sequence of actions that leads
to the goal set, suppose we would like a solution that optimizes some criterion,
such as time, distance, or energy consumed. Three important extensions will be
made: 1) a stage index will be added for convenience to indicate the current
plan step; 2) a cost functional will be introduced, which serves as a kind of taxi
meter to determine how much cost will accumulate; 3) a termination action will be
introduced, which intuitively indicates when it is time to stop the plan and fix
the total cost.

The presentation involves three phases. First, the problem of finding optimal
paths of a fixed length is covered in Section 2.4.1. The approach involves
performing dynamic programming iterations over the state space. Although this
case is not very useful by itself, it is much easier to understand than the general
case of variable-length plans. Once the concepts from this section are understood,
their extension to variable-length plans will be much clearer, and is covered in
Section 2.4.2. Finally, Section 2.4.3 explains the close relationship between the
general DP iterations of Section 2.4 and the special case of Dijkstra's algorithm,
which was covered in Section 2.3.2 as a particular search algorithm.

With nearly all optimization problems, there is the arbitrary, symmetric issue
of defining the task in a way that requires minimization or maximization. If the
cost is a kind of energy or expense, then minimization seems sensible, as is typical
in control theory. If the cost is a kind of reward, as in investing or typical AI
research, then maximization is preferred. Although this issue remains throughout
the book, we will choose to minimize everything. If maximization is preferred,
then multiplying the costs by −1, and maximizing wherever it says to minimize
(also minimizing where it says to maximize in some later chapters), should suffice.



3 In some variations, the vertex could be added without a corresponding edge. This
would start another tree in a multiple-tree approach.

The fixed-length optimal planning model will be given shortly, but first some
new notation is introduced. Let π_K denote a K-step plan, which is a sequence
(u_1, u_2, ..., u_K) of K actions. Note that if π_K and x_I are given, then a sequence
of states, (x_1, x_2, ..., x_{K+1}), can be derived using the state transition equation, f.
Initially, x_1 = x_I, and each following state is obtained by x_{k+1} = f(x_k, u_k).

The model is now given; the most important addition with respect to Formu- 
lation 2.2.1 is L, the cost functional. 

Formulation 2.4.1 (Discrete Fixed-Length Optimal Planning) 

1. All of the components from Formulation 2.2.1 are inherited directly: X,
U(x), f, x_I, and X_G, except here it is assumed that X is finite.

2. A number, K, of stages, which is the exact length of a plan (measured as
the number of actions, u_1, u_2, ..., u_K). States will also obtain a stage index:
x_{k+1} denotes the state obtained after u_k is applied.

3. Let L denote a real-valued, additive cost (or loss) functional, which is applied
to a K-step plan, π_K. This means that the sequence, (u_1, ..., u_K), of actions
and the sequence, (x_1, ..., x_{K+1}), of states may appear in an expression of
L. For convenience, let F = K + 1, to denote the final stage (note that the
application of u_K advances the stage to K + 1). The cost functional is

    L(\pi_K) = \sum_{k=1}^{K} l(x_k, u_k) + l_F(x_F).    (2.3)

The final term, l_F(x_F), is outside of the sum, and is defined as l_F(x_F) = 0
if x_F ∈ X_G, and l_F(x_F) = ∞ otherwise.

An important comment must be made regarding l_F. Including l_F in (2.3)
is actually unnecessary if it is agreed in advance that L will only be applied
to evaluate plans that reach X_G. It would be undefined for all other plans. The
algorithms to be presented shortly will also function nicely under this assumption;
however, the notation and explanation can become more cumbersome because
the action space must always be restricted to ensure that successful plans are
produced. Instead of this, the domain of L is extended to include all plans,
and those that do not reach X_G are penalized with infinite cost so that they are
eliminated automatically in any optimization steps. At some point, the role of
l_F may become confusing, and it is helpful to remember that it is just a trick to
convert feasibility constraints into a straightforward optimization (L = ∞ means
not feasible and L < ∞ means feasible with cost L).

Now the task is to find a plan that minimizes L. To obtain a feasible planning
problem like Formulation 2.2.1, but restricted to K-step plans, let l(x,u) ≡ 0. To
obtain a planning problem that requires minimizing the number of stages, let
l(x,u) ≡ 1. The possibility also exists of having goals that are less "crisp" by
letting l_F(x) vary for different x ∈ X_G, as opposed to l_F(x) = 0. This is much
more general than what was allowed with feasible planning because now states
may take on any value, as opposed to being classified as inside or outside of X_G.
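The way the cost functional turns feasibility into optimization can be seen by
evaluating L for a given plan. The Python sketch below does this under simple
assumptions: f and l are passed in as ordinary functions, the goal set is a Python
set, and the name plan_cost and the integer-line example are hypothetical.

```python
import math

def plan_cost(plan, x_init, f, l, goal_set):
    """Evaluate L(pi_K) = sum_k l(x_k, u_k) + l_F(x_F) for a K-step plan."""
    x = x_init
    total = 0.0
    for u in plan:
        total += l(x, u)      # stage cost l(x_k, u_k)
        x = f(x, u)           # x_{k+1} = f(x_k, u_k)
    # Final term: zero inside the goal set, infinite otherwise.
    l_F = 0.0 if x in goal_set else math.inf
    return total + l_F

# Hypothetical model: states are integers, an action adds +1 or -1,
# and each stage costs one unit (minimizing the number of stages).
def step(x, u):
    return x + u

def unit_cost(x, u):
    return 1.0

print(plan_cost((1, 1, 1), 0, step, unit_cost, {3}))
print(plan_cost((1, 1, -1), 0, step, unit_cost, {3}))
```

A plan that ends outside the goal set evaluates to infinity, so any minimization
over plans discards it without a separate feasibility test.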

2.4.1 Optimal Fixed-Length Plans 

Consider computing an optimal plan under Formulation 2.4.1. One could naively
generate all length-K sequences of actions and select the sequence that produces
the best cost, but this would require O(|U|^K) running time (imagine K nested
loops, one for each stage), which is clearly prohibitive. Luckily, the dynamic
programming (DP) principle will help. We first say in words what will appear later
in equations. The DP idea is that portions of optimal plans are themselves optimal.
It would be absurd to be able to replace a portion of an optimal plan with a portion
that produces lower total cost; this contradicts the optimality of the original plan.

The principle of optimality leads directly to an iterative algorithm that can 
solve a vast collection of optimal planning problems, including those that involve 
variable-length plans, stochastic uncertainties, imperfect state measurements, and 
many other complications. In some cases, the approach can be adapted to the well- 
known Dijkstra's algorithm; however, it is important to realize that this is only a 
special case which applies to a narrower set of problems. The following text will 
describe the general DP iterations, and Section 2.4.3 discusses their connection to 
Dijkstra's algorithm. 

Backwards dynamic programming 

Just as for the search methods, there will be both a forward and backwards version 
of the approach. The backwards case will be covered first. Even though it does 
not appear as straightforward on the surface to progress backwards from the goal, 
it turns out that this case is notationally simpler. The forward case will then be 
covered once some additional notation is introduced. 

The key to deriving long optimal plans from shorter ones lies in the construction
of optimal cost-to-go functions over X. For 1 ≤ k ≤ F, let G_k^* denote the
cost that accumulates from stage k to F under the execution of the optimal plan:

    G_k^*(x_k) = \min_{u_k,\ldots,u_K} \Big\{ \sum_{i=k}^{K} l(x_i,u_i) + l_F(x_F) \Big\}.    (2.4)

Inside of the min of (2.4) are the last K − k + 1 terms of the sum in the cost
functional, (2.3), together with the final term. The optimal cost-to-go for the
boundary condition of k = F reduces to

    G_F^*(x_F) = l_F(x_F).    (2.5)

This makes intuitive sense: since there are no stages in which an action can be
applied, the final stage cost is immediately received.

Now consider an algorithm that makes K passes over X, each time computing
G_k^* from G_{k+1}^*, as k ranges from F down to 1. In the first iteration, G_F^* is
copied from l_F without significant effort. In the second iteration, G_K^* is computed
for each x_K ∈ X as

    G_K^*(x_K) = \min_{u_K} \big\{ l(x_K,u_K) + l_F(x_F) \big\}.    (2.6)

Because l_F = G_F^* and x_F = f(x_K,u_K), substitutions can be made into (2.6) to
obtain

    G_K^*(x_K) = \min_{u_K} \big\{ l(x_K,u_K) + G_F^*(f(x_K,u_K)) \big\},    (2.7)

which is straightforward to compute for each x_K ∈ X. This computes the costs
of all optimal one-step plans from stage K to stage F = K + 1.

It will next be shown that G_k^* can be computed similarly once G_{k+1}^* is given.
Carefully study (2.4) and note that it can be written as

    G_k^*(x_k) = \min_{u_k} \min_{u_{k+1},\ldots,u_K} \Big\{ l(x_k,u_k) + \sum_{i=k+1}^{K} l(x_i,u_i) + l_F(x_F) \Big\}    (2.8)

by pulling the first term out of the sum, and by separating the minimization over
u_k from the rest, which range from u_{k+1} to u_K. The second min does not affect the
l(x_k,u_k) term; thus, l(x_k,u_k) can be pulled outside to obtain

    G_k^*(x_k) = \min_{u_k} \Big\{ l(x_k,u_k) + \min_{u_{k+1},\ldots,u_K} \Big\{ \sum_{i=k+1}^{K} l(x_i,u_i) + l_F(x_F) \Big\} \Big\}.    (2.9)

The inner min is exactly the definition of the cost-to-go function G_{k+1}^*, which
yields the following recurrence:

    G_k^*(x_k) = \min_{u_k} \big\{ l(x_k,u_k) + G_{k+1}^*(x_{k+1}) \big\},    (2.10)

in which x_{k+1} = f(x_k,u_k). Now that the right side of (2.10) depends only on x_k,
u_k, and G_{k+1}^*, the computation of G_k^* easily proceeds in O(|X||U|) time. Note
that in each pass over X, some states receive an infinite value only because they
are not reachable: a (K − k + 1)-step plan from x_k to X_G does not exist. In terms
of DP, this means that an action u_k ∈ U(x_k) does not exist that brings x_k to some
state x_{k+1} ∈ X from which a (K − k)-step plan exists that terminates in X_G.

Summarizing, the computations of cost-to-go functions proceed as follows:

    G_F^* \to G_K^* \to G_{K-1}^* \cdots G_k^* \to G_{k-1}^* \cdots G_2^* \to G_1^*    (2.11)

until finally, G_1^* is determined after O(K|X||U|) time. The resulting G_1^* may be
applied to yield G_1^*(x_I), the optimal cost to get to the goal from x_I. It will also
conveniently give the optimal cost-to-go for any other initial state, which may be
infinity for those from which X_G cannot be reached.

Figure 2.9: A five-state example is shown. Each vertex represents a state, and each
edge represents an input that can be applied to the state transition equation to
change the state. The weights on the edges represent l(x_k,u_k) (x_k is the originating
vertex of the edge).

Figure 2.10: The possibilities are shown for advancing forward one stage. This
is obtained by making two copies of the states from Figure 2.9, one copy for the
current state, and one for the potential next state.

It seems nice that the cost of the optimal plan can be computed so easily, but
how is such a plan extracted? One possibility is to store the action that satisfied
the min from every state, and at every stage. Unfortunately, this requires O(K|X|)
storage, but it can be reduced to O(|X|) using the tricks in Section 2.4.2 for the
more general case of optimizing over variable-length plans.
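The backwards iterations and the plan extraction just described can be sketched
as follows. The four-state graph used here is a hypothetical example (it is not
the graph of Figure 2.9), and the names backward_dp, extract_plan, and graph are
assumptions. The table policy[k][x] stores the minimizing action for every state
and stage, which is the O(K|X|) storage mentioned above.

```python
import math

def backward_dp(graph, goal_set, K):
    """Return (G, policy): G[k][x] is the optimal cost-to-go at stage k = 1..K+1,
    and policy[k][x] is an action attaining the min in the recurrence (or None)."""
    F = K + 1
    G = {F: {x: (0.0 if x in goal_set else math.inf) for x in graph}}   # G_F = l_F
    policy = {}
    for k in range(K, 0, -1):
        G[k], policy[k] = {}, {}
        for x, actions in graph.items():
            best_cost, best_u = math.inf, None
            for u, (x_next, cost) in actions.items():
                total = cost + G[k + 1][x_next]      # l(x_k, u_k) + G_{k+1}(x_{k+1})
                if total < best_cost:
                    best_cost, best_u = total, u
            G[k][x], policy[k][x] = best_cost, best_u
    return G, policy

def extract_plan(graph, policy, x_init, K):
    """Follow the stored minimizing actions from x_init; None if no plan exists."""
    plan, x = [], x_init
    for k in range(1, K + 1):
        u = policy[k][x]
        if u is None:
            return None
        plan.append(u)
        x = graph[x][u][0]
    return plan

# Hypothetical model: graph[x][u] = (f(x, u), l(x, u)).
graph = {
    "p": {"stay": ("p", 2), "go": ("q", 2)},
    "q": {"go": ("r", 1), "jump": ("g", 4)},
    "r": {"go": ("g", 1)},
    "g": {"back": ("r", 1)},
}
G, policy = backward_dp(graph, {"g"}, K=3)
print(G[1])
print(extract_plan(graph, policy, "p", K=3))
```

Each pass touches every state-action pair once, which is the O(|X||U|) work per
stage noted earlier; states that cannot reach the goal in exactly the remaining
number of steps keep an infinite value.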

Example 2.4.1 (A five-state optimal planning problem) 

Figure 2.9 shows a graph representation of a planning problem in which X =
{a, b, c, d, e}. Suppose that K = 4, x_I = a, and X_G = {d}. There will hence
be four DP iterations, which construct G_4^*, G_3^*, G_2^*, and G_1^*, once the
final-stage cost-to-go, G_5^*, is given.

The cost-to-go functions are:

Figure 2.11: By turning Figure 2.10 sideways and copying it K times, a graph
can be drawn that easily shows all ways to arrive at a final state from an initial
state by flowing from left to right. The DP computations select automatically the
optimal route.

Figures 2.10 and 2.11 help illustrate the computations. For computing G_4^*, only b
and c receive finite values because only they can reach d in one stage. For
computing G_3^*, only the values G_4^*(b) = 4 and G_4^*(c) = 1 are important. Only
paths that reach b or c could possibly lead to d in stage k = 5. Note that the
minimization in (2.10) always chooses the action that produces the best total cost
when arriving at a vertex in the next stage. ■



Forward dynamic programming 

The ideas from Section 2.4.1 may be recycled to yield a symmetrically equivalent
method that computes cost-to-come functions from the initial stage. Whereas
backwards DP was able to find optimal plans from all initial states simultaneously,
forward DP can be used to find optimal plans to all states in X. In the backwards
case, X_G must be fixed, and in the forward case, x_I must be fixed.

The issue of maintaining feasible solutions appears again. In the forward
direction, the role of l_F is not important. It may be applied in the last iteration,
or it can be dropped altogether for problems that do not have a predetermined
X_G. However, one must force all plans considered by forward DP to originate
from x_I. There is the familiar choice of making notation that imposes constraints
on the action spaces, or simply adding a term that forces infeasible plans to have
infinite cost. Once again, we choose the latter.

Let C_k^* denote the optimal cost-to-come from stage 1 to stage k, optimized over
all (k − 1)-step plans. To preclude plans that do not start at x_I, the definition of
C_1^* is given by

    C_1^*(x_1) = l_I(x_1),    (2.12)

in which l_I is a new function that yields l_I(x_I) = 0 and l_I(x) = ∞ for x ≠ x_I.
Thus, any plans that try to start from another state will immediately receive
infinite cost.

For an intermediate stage, k ∈ {2, ..., K}, the following represents the optimal
cost-to-come:

    C_k^*(x_k) = \min_{u_1,\ldots,u_{k-1}} \Big\{ l_I(x_1) + \sum_{i=1}^{k-1} l(x_i,u_i) \Big\}.    (2.13)

Note that the sum refers to a sequence of states, x_1, ..., x_{k−1}, which is the result
of applying the action sequence (u_1, ..., u_{k−2}). The last state, x_k, is not included
because its cost term, l(x_k,u_k), requires the application of an action, u_k, which
has not been chosen. If it is possible to write the cost additively, as l(x_k,u_k) =
l_1(x_k) + l_2(u_k), then the l_1(x_k) part could be included in the cost-to-come
definition, if desired. This detail will not be considered further.

As in (2.4), it is assumed in (2.13) that u_i ∈ U(x_i) for every i ∈ {1, ..., k − 1}.
The resulting x_k, obtained after applying u_{k−1}, must be the same x_k that is named
in the argument on the left side of (2.13). It might appear odd that x_1 appears
inside of the min above; however, this is not a problem. The state x_1 can be
completely determined once u_1, ..., u_{k−1} and x_k are given.

The final step in forward DP is the arrival at the final stage, F. The cost-to-come
in this case is

    C_F^*(x_F) = \min_{u_1,\ldots,u_K} \Big\{ l_I(x_1) + \sum_{i=1}^{K} l(x_i,u_i) \Big\}.    (2.14)

This equation looks the same as (2.7), but l_I is used instead of l_F. This has the
effect of filtering the plans that are considered to only those that start at x_I. The
forward DP iterations will find optimal plans to any reachable final state from x_I.
This behavior is complementary to that of backwards DP. In that case, X_G was
fixed, and optimal plans from any initial state were found. For forward DP, this
is reversed.

To express the DP recurrence, one further issue remains. Suppose that C_{k−1}^*
is known by induction, and we want to compute C_k^*(x_k) for a particular x_k. This
means that we must start at some state x_{k−1} and arrive in state x_k by applying
some action. Once again, the backwards state transition equation from Section
2.3.3 is useful. Using the stage indices, it is written here as
x_{k−1} = f^{-1}(x_k, u_k^{-1}). Using f^{-1}, the DP equation is:

    C_k^*(x_k) = \min_{u_k^{-1} \in U^{-1}(x_k)} \big\{ C_{k-1}^*(x_{k-1}) + l(x_{k-1},u_{k-1}) \big\},    (2.15)

in which x_{k−1} = f^{-1}(x_k, u_k^{-1}) and u_{k−1} ∈ U(x_{k−1}) is the input to which
u_k^{-1} ∈ U^{-1}(x_k) corresponds. Using (2.15), the final cost-to-come may be
iteratively computed in O(K|X||U|) time, just as in the case of computing the
first-stage cost-to-go in backwards dynamic programming.
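A minimal Python sketch of the forward iterations follows. It precomputes the
backwards action sets from a forward model and then applies the recurrence stage
by stage. The graph is a hypothetical four-state example (not the one in Figure
2.9), and the names forward_dp and graph are assumptions.

```python
import math

def forward_dp(graph, x_init, K):
    """Return C with C[k][x] the optimal cost-to-come at stage k = 1, ..., K + 1."""
    # Backwards action sets: U_inv[x'] lists (x, l(x, u)) for every edge into x'.
    U_inv = {x: [] for x in graph}
    for x, actions in graph.items():
        for u, (x_next, cost) in actions.items():
            U_inv[x_next].append((x, cost))
    # Stage 1 uses l_I: zero at the initial state, infinite elsewhere.
    C = {1: {x: (0.0 if x == x_init else math.inf) for x in graph}}
    for k in range(2, K + 2):
        C[k] = {
            x: min((C[k - 1][x_prev] + cost for x_prev, cost in U_inv[x]),
                   default=math.inf)     # no incoming edge: unreachable
            for x in graph
        }
    return C

# Hypothetical model: graph[x][u] = (f(x, u), l(x, u)).
graph = {
    "p": {"stay": ("p", 2), "go": ("q", 2)},
    "q": {"go": ("r", 1), "jump": ("g", 4)},
    "r": {"go": ("g", 1)},
    "g": {"back": ("r", 1)},
}
C = forward_dp(graph, "p", K=3)
print(C[4])
```

The last table, C[K + 1], holds the optimal cost of reaching each state from the
initial state in exactly K steps, which is the complement of what the backwards
iterations provide.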

Example 2.4.2 (Forward DP for the five-state problem) 

Example 2.4.1 will now be revisited for the case of forward DP with fixed plan
length for K = 4. The following cost-to-come functions are obtained by direct
application of (2.15):

    State    a    b    c    d    e
    C_1^*    0    ∞    ∞    ∞    ∞
    C_2^*    2    2    ∞    ∞    ∞
    C_3^*    4    4    3    6    ∞
    C_4^*    6    6    5    4    7
    C_5^*    6    5    5    6    5

It will be helpful to refer to Figures 2.10 and 2.11 once again. The first row
corresponds to the immediate application of l_I. In the second row, finite values are
obtained for a and b, which are reachable in one stage from x_I = a. The iterations
continue until k = 5, at which point the optimal cost-to-come is determined for
every state. ■



2.4.2 The General Case 

The dynamic programming techniques for fixed-length plans can be generalized 
nicely to the more interesting case in which plans of varying lengths are allowed. 
There will be no bound on the maximal length of a plan; therefore, the current 
case is truly a generalization of Formulation 2.2.1 because arbitrarily long plans 
may be attempted in efforts to reach X G . 

The model for the general case does not require the specification of K and also
introduces a special action, u_T.

Formulation 2.4.2 (Discrete Optimal Planning)

1. All of the components from Formulation 2.2.1 are inherited directly: X,
U(x), f, x_I, and X_G. Also, the notion of stages from Formulation 2.4.1 will be
used.

2. Let L denote a real-valued, additive cost (or loss) functional, which may be
applied to any K-step plan, π_K, to yield

    L(\pi_K) = \sum_{k=1}^{K} l(x_k,u_k) + l_F(x_F).    (2.16)

In comparison with L from Formulation 2.4.1, the present expression does not
consider K as a predetermined constant. It will now vary, depending on the
length of the plan. Thus, the domain of L is much larger.

3. Each U(x) contains a special termination action, u_T. If u_T is applied to x_k
at stage k, then the action is repeatedly applied forever, the state remains
in x_k forever, and no more cost accumulates. Thus, for all i ≥ k, u_i = u_T,
x_i = x_k, and l(x_i,u_T) = 0.

The termination action is the key to allowing plans of different lengths. It will
appear throughout this book. Suppose we would like to perform the DP iterations
for K = 5, and there is a two-step plan, (u_1, u_2), that arrives in X_G from
x_I. This plan is equivalent to the five-step plan (u_1, u_2, u_T, u_T, u_T) because the
termination action does not change the state nor does it accumulate cost. The
resulting five-step plan will reach X_G and cost the same as (u_1, u_2). With this
simple extension, the forward and backwards DP methods of Section 2.4.1 may
be applied for any fixed K to optimize over all plans of length K or less (instead
of fixed K).

The next step is to remove the dependency on K. Consider running backwards
DP indefinitely. At some point, G*_1 will be computed, but there is no reason why
the process cannot be continued onward to G*_0, G*_{-1}, etc. Recall that x_I is
not utilized in the backwards DP; therefore, there is no concern regarding the
starting state of the plans. Suppose that backwards dynamic programming was used
for K = 16 and was executed down to G*_{-8}. This considers all plans of length 25
or less. Note that for convenience, it is harmless to add 9 to all stage indices to
shift all of the cost-to-go functions. Instead of running from G*_{-8} to G*_{16},
they can run from G*_1 to G*_{25}. The shifting of indices is allowed because none
of the costs depend on the particular index that is given to the stage. The only
important aspect of the DP computations is that they proceed backwards, and
sequentially from stage to stage.

Eventually, enough iterations will have executed so that an optimal plan is
known from every state that can reach X_G. From that stage, say k, onward, the
cost-to-go values from one iteration to the next will be stationary, meaning that
for all i ≤ k, G*_{i-1}(x) = G*_i(x) for all x ∈ X. Once the stationary condition
is reached, the cost-to-go no longer depends on a particular stage k.
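The iteration to a stationary cost-to-go can be sketched as follows. The five-state graph is an assumption (its edge costs are chosen to agree with the values quoted later in Example 2.4.3), and the dictionary encoding and function name are illustrative only.

```python
import math

# Hypothetical graph: edges[x][x_next] is the cost l(x, u) of the action
# that moves from x to x_next.
edges = {
    "a": {"a": 2, "b": 2},
    "b": {"c": 1, "d": 4},
    "c": {"a": 1, "d": 1},
    "d": {"c": 1, "e": 1},
    "e": {},
}
X_G = {"d"}


def backward_dp(edges, X_G):
    """Repeat G_{k-1}(x) = min_u { l(x,u) + G_k(f(x,u)) } until stationary."""
    # Final stage: zero cost inside the goal set, infinite cost elsewhere.
    G = {x: (0 if x in X_G else math.inf) for x in edges}
    iterations = 0
    while True:
        G_prev = {}
        for x in edges:
            best = G[x]  # termination action: same state, no added cost
            for x_next, cost in edges[x].items():
                best = min(best, cost + G[x_next])
            G_prev[x] = best
        iterations += 1
        if G_prev == G:  # stationary: no value changed in this iteration
            return G, iterations
        G = G_prev


G_star, n = backward_dp(edges, X_G)
print(G_star, n)
```

The loop stops at the first iteration that changes nothing, so the number of iterations is one more than the number that actually improved a value.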

Are there any conditions under which backwards DP could be run forever,
with each iteration producing a cost-to-go function in which some values are
different from the previous iteration? If l(x, u) is nonnegative for all x ∈ X and
u ∈ U(x), then this could never happen. It could certainly be true that for any
fixed K, longer plans will exist, but this cannot be said of optimal plans. For every
x ∈ X, there either exists a plan that reaches X_G or there does not. For each state
from which there exists a plan that reaches X_G, consider the number of steps in
the optimal plan. Take the maximum number of steps over such optimal plans,
one from each state that can reach X_G. This serves as a limit on the number of
DP iterations that are needed. Any further iterations will just consider solutions
that are worse than the ones already considered (some may be equivalent due to
the termination action and shifting of stages). Some trouble might occur if l(x, u)
contains negative values. If in the corresponding graph representation there is a
cycle whose total cost is negative, then it will be preferable to execute a plan that
travels around the cycle forever, reducing the total cost to −∞. We will assume
that the cost functional is defined in a sensible way so that such negative cycles do
not exist. Otherwise, the optimization model itself appears flawed. Some negative
values for l(x, u), however, are allowed as long as there are no negative cycles.

Let −K denote the iteration at which the cost-to-go values become stationary.
At this point, a real-valued, optimal cost-to-go function, G* : X → R, may be
expressed by assigning G* = G*_{-K}. In other words, the particular stage index no
longer matters. The value G*(x) gives the optimal cost to go from state x ∈ X
to the specific goal state x_G. The optimal cost-to-go, G*, can be used to recover
the optimal actions, if they were not explicitly stored by the algorithm. Consider
starting from some x ∈ X. What is the optimal next action? This is given by

\operatorname*{argmin}_{u \in U(x)} \{ l(x, u) + G^*(f(x, u)) \},    (2.17)

which is the action, u, that minimizes an expression that is very similar to (2.10).
The only difference is that the stage indices are dropped because the cost-to-go
values no longer depend on them. After applying u, the state transition equation
is used to obtain x' = f(x, u), and (2.17) may be applied again on x'. This process
continues until a state in X_G is reached. This procedure is based directly on the
DP equations; therefore, it recovers the optimal plan. The function G* serves
as a kind of guide that leads the system from any initial state into the goal set
optimally. This can be considered as a special case of a navigation function, which
will be covered in Chapter 8.
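A minimal sketch of this recovery procedure follows. The graph, the action names such as "to_b", and the helper names are hypothetical. For brevity, G* is computed here with in-place updates rather than stage-by-stage iterations; with the positive costs assumed below, both reach the same stationary values.

```python
import math

# Hypothetical graph: edges[x][u] = (f(x, u), l(x, u)).
edges = {
    "a": {"stay": ("a", 2), "to_b": ("b", 2)},
    "b": {"to_c": ("c", 1), "to_d": ("d", 4)},
    "c": {"to_a": ("a", 1), "to_d": ("d", 1)},
    "d": {"to_c": ("c", 1), "to_e": ("e", 1)},
    "e": {},
}
X_G = {"d"}


def cost_to_go(edges, X_G):
    """Stationary cost-to-go G*, obtained by repeating the DP update in place."""
    G = {x: (0 if x in X_G else math.inf) for x in edges}
    changed = True
    while changed:
        changed = False
        for x in edges:
            for x_next, cost in edges[x].values():
                if cost + G[x_next] < G[x]:
                    G[x] = cost + G[x_next]
                    changed = True
    return G


def recover_plan(x, G, edges, X_G):
    """Apply (2.17) repeatedly: pick argmin_u { l(x,u) + G*(f(x,u)) }."""
    if math.isinf(G[x]):
        return None  # no plan reaches the goal set from x
    plan = []
    while x not in X_G:
        u = min(edges[x], key=lambda a: edges[x][a][1] + G[edges[x][a][0]])
        plan.append(u)
        x = edges[x][u][0]
    return plan


G_star = cost_to_go(edges, X_G)
print(recover_plan("a", G_star, edges, X_G))
```

No actions need to be stored during the DP iterations; the cost-to-go table alone is enough to reconstruct a plan from any state that can reach the goal.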

Figure 2.12: Compare this figure to Figure 2.11, for which K was fixed at 4. The
effect of the termination action is depicted as dashed-line edges that yield zero
cost when traversed. This enables plans of all finite lengths to be considered. Also,
the stages extend indefinitely to the left (for the case of backwards DP).

Just as in the case of fixed-length plans, the direction of the DP iterations
may be reversed to obtain a forward DP algorithm that solves the variable-length
planning problem. In this case, the backwards state transition equation, f^{-1}, is
used once again. Also, the initial cost term l_I is used instead of l_F, just as in
(2.13). The forward DP algorithm can start at k = 1, and then it iterates until the
cost-to-come values become stationary. Once again, the termination action, u_T,
preserves the cost of plans that arrived at a state in earlier iterations. Note that
it is not required to specify X_G for these forward DP iterations. A counterpart to
G* may be obtained, from which optimal actions can be recovered. When the
cost-to-come values become stationary, an optimal cost-to-come function,
C* : X → R, may be expressed by assigning C* = C*_F, in which F is the final
stage reached when the algorithm terminates. The value C*(x) gives the cost of an
optimal plan that starts from x_I and reaches x. The optimal action sequence for
any specified goal x_G ∈ X can be obtained using

\operatorname*{argmin}_{u^{-1} \in U^{-1}(x)} \{ C^*(f^{-1}(x, u^{-1})) + l(f^{-1}(x, u^{-1}), u') \},    (2.18)

which is the forward DP counterpart of (2.17). The u' is the action in
U(f^{-1}(x, u^{-1})) that yields x when the state transition equation, f, is applied.
The iterations proceed backwards from x_G, and terminate when x_I is reached.
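The forward iterations and the backtracking step of (2.18) can be sketched in the same style. The graph is again hypothetical, the function names are illustrative, and strictly positive costs are assumed so that backtracking cannot cycle.

```python
import math

# Hypothetical graph: edges[x][x_next] is the cost of moving from x to x_next.
edges = {
    "a": {"a": 2, "b": 2},
    "b": {"c": 1, "d": 4},
    "c": {"a": 1, "d": 1},
    "d": {"c": 1, "e": 1},
    "e": {},
}


def forward_dp(edges, x_I):
    """Iterate the cost-to-come until it becomes stationary."""
    C = {x: (0 if x == x_I else math.inf) for x in edges}
    while True:
        C_next = dict(C)  # termination action preserves earlier costs
        for x, out in edges.items():
            for x_next, cost in out.items():
                C_next[x_next] = min(C_next[x_next], C[x] + cost)
        if C_next == C:
            return C
        C = C_next


def backtrack(C, edges, x_I, x_G):
    """Use (2.18): step to the predecessor that minimizes C*(x_prev) + l."""
    if math.isinf(C[x_G]):
        return None
    path = [x_G]
    while path[-1] != x_I:
        x = path[-1]
        preds = [p for p in edges if x in edges[p]]
        path.append(min(preds, key=lambda p: C[p] + edges[p][x]))
    return path[::-1]


C_star = forward_dp(edges, "b")
print(C_star, backtrack(C_star, edges, "b", "d"))
```

Note that forward_dp never refers to a goal; the goal is chosen only when backtrack is called, so one cost-to-come table serves every possible x_G.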



Example 2.4.3 (DP iterations for variable-length plans) 

Once again, Example 2.4.1 is revisited; however, this time the plan length is not
fixed thanks to the termination action. Its effect is depicted in Figure 2.12 by the
superposition of new edges that have zero cost. It might appear at first that there
is no incentive to choose other actions, but remember that any plan that does not
terminate in state x_G = d will receive infinite cost.

After a few backwards DP iterations, the cost-to-go values become stationary.
After this point, the termination action is being applied from all reachable states
and no further loss accumulates. The final cost-to-go function is defined to be G*.
Since d is not reachable from e, G*(e) = ∞.

As an example of using (2.17) to recover optimal actions, consider starting
from state a. The action that leads to b is chosen next because the total cost
2 + G*(b) = 4 is better than 2 + G*(a) = 6 (the 2 comes from the action cost).
From state b, the optimal action leads to c, which produces total cost
1 + G*(c) = 2. Similarly, the next action leads to d ∈ X_G, which terminates the
plan.

Using forward DP, suppose that x_I = b. The following cost-to-come functions
are obtained:

For any finite value that remains constant from one iteration to the next, the
termination action was applied. Note that the last DP iteration is useless in
this example. Once C*_3 is computed, the optimal cost-to-come to every possible
state from x_I is determined, and future cost-to-come functions will look identical.
Therefore, the final cost-to-come is renamed to C*. ■



2.4.3 Dijkstra Revisited 

So far two different kinds of dynamic programming have been covered. The meth- 
ods of Section 2.4.2 involve repeated computations over the entire state space. 
Dijkstra's algorithm from Section 2.3.2 flows only once through the state space, 
but with the additional overhead of maintaining which states are alive. 

Dijkstra's algorithm can be derived by focusing on the forward dynamic
programming computations, as in Example 2.4.3, and identifying exactly where the
"interesting" changes occur. Recall that for Dijkstra's algorithm, it was assumed
that all costs are nonnegative. For any states that are not reachable, their values
remain at infinity. They are precisely the unvisited states. States for which the
optimal cost-to-come has already been finalized are dead. For the remaining states,
an initial cost is obtained, but this cost may be lowered multiple times until the
optimal cost is obtained. All states for which the cost is finite, but possibly not
optimal, are in the queue, Q.

After understanding the general DP iterations of this section, it is easier to
understand why Dijkstra's form of dynamic programming correctly computes
optimal solutions. It is clear that the unvisited states will remain at infinity in
both algorithms because no plan has reached them. It is helpful to consider the
forward DP iterations in Example 2.4.3 for comparison. In a sense, Dijkstra's
algorithm is very much like the general DP iterations, except that it efficiently
maintains the set of states within which cost-to-come values change. It correctly
inserts any states that are reached for the first time, changing their cost-to-come
from infinity to a finite value. The values are changed in the same manner as
in the DP iterations. At the end of both algorithms, the resulting values should
correspond to the stationary, optimal cost-to-come, C*.

If Dijkstra's algorithm seems so clever, then why have we spent time covering
the general DP algorithm? For some problems it may become too expensive to
maintain the sorted queue, and the DP iterations could provide a more efficient
alternative. A more important reason is that the general DP iterations apply to
a much broader class of problems by simple extensions of the method. Examples
to which they apply include optimal planning over continuous state spaces
(Section ??), stochastic optimal planning (Section ??), and computing dynamic game
equilibria (Section ??). In some cases, it is still possible to obtain a Dijkstra-like
algorithm by focusing the computation on the "interesting" region; however, as the
model becomes more complicated, it may be inefficient or impossible in practice
to maintain this region. Therefore, it is important to have a good understanding
of both algorithms to determine which is most appropriate for a given problem.

Dijkstra's algorithm belongs to a broader family of label-correcting algorithms,
which all produce optimal plans by making small modifications to the general
forward search algorithm in Figure 2.5. Figure 2.13 shows the resulting algorithm.
The main difference is to allow states to become alive again if a better
cost-to-come is found. This enables other cost-to-come values to be improved
accordingly. This is not important for Dijkstra's algorithm and A* because they
only need to visit each state once. Thus, the algorithms in Figures 2.5 and 2.13 are
essentially the same in this case. However, the label-correcting algorithm produces
optimal solutions for any sorting of Q, including FIFO (breadth first) and LIFO
(depth first), as long as X is finite. If X is not finite, then the issue of systematic
search dominates because one must ensure that states are revisited sufficiently
many times to guarantee that optimal solutions will eventually be found.

FORWARD_LABEL_CORRECTING(x_G)

 1  Set C(x) = ∞ for all x ≠ x_I, and set C(x_I) = 0
 2  Q.Insert(x_I)
 3  while Q not empty do
 4      x ← Q.GetFirst()
 5      forall u ∈ U(x)
 6          x' ← f(x, u)
 7          if C(x) + l(x, u) < min{C(x'), C(x_G)} then
 8              C(x') ← C(x) + l(x, u)
 9              if x' ≠ x_G then
10                  Q.Insert(x')



Figure 2.13: A generalization of Dijkstra's algorithm, which upon termination 
produces an optimal plan (if one exists) for any prioritization of Q, as long as X 
is finite. Compare this to Figure 2.5. 

Another important difference is that the algorithm uses the cost at the goal
state to prune away many candidate paths, which is shown in Line 7. Thus, it
is only formulated to work for a single goal state; it can be adapted to work
for multiple goal states, but performance degrades. The motivation for including
C(x_G) in Line 7 is that there is no need to worry about improving costs at some
state, x', if its new cost-to-come would be higher than C(x_G) because there is no
way it could be along a path that improves the cost to go to x_G. Similarly, x_G is
not inserted in Line 10 because there is no need to consider plans that have x_G
as an intermediate state. To recover the plan, either pointers can be stored from
x to x' each time an update is made in Line 7, or the final, optimal cost-to-come,
C*, can be used to recover the actions using (2.18).
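A runnable sketch of Figure 2.13 is given below, using the pointer-based plan recovery. The graph encoding, the parent dictionary, and the lifo flag are assumptions made for illustration; nonnegative costs and a finite X are assumed.

```python
import math
from collections import deque

# Hypothetical graph: edges[x][x_next] is the cost of moving from x to x_next.
edges = {
    "a": {"a": 2, "b": 2},
    "b": {"c": 1, "d": 4},
    "c": {"a": 1, "d": 1},
    "d": {"c": 1, "e": 1},
    "e": {},
}


def forward_label_correcting(edges, x_I, x_G, lifo=False):
    """Return (cost, path) to x_G; the queue order may be FIFO or LIFO."""
    C = {x: math.inf for x in edges}
    C[x_I] = 0
    parent = {}
    Q = deque([x_I])
    while Q:
        x = Q.pop() if lifo else Q.popleft()
        for x_next, cost in edges[x].items():
            new_cost = C[x] + cost
            # Line 7: must improve on both C(x') and the best goal cost so far.
            if new_cost < min(C[x_next], C[x_G]):
                C[x_next] = new_cost
                parent[x_next] = x  # pointer used to recover the plan
                if x_next != x_G:
                    Q.append(x_next)  # x' becomes alive (possibly again)
    if math.isinf(C[x_G]):
        return math.inf, None
    path = [x_G]
    while path[-1] != x_I:
        path.append(parent[path[-1]])
    return C[x_G], path[::-1]


print(forward_label_correcting(edges, "a", "d"))
print(forward_label_correcting(edges, "a", "d", lifo=True))
```

Because of the pruning against C(x_G), only the cost at the goal is guaranteed to be optimal on termination; values at other states may remain unimproved.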

2.5 Logic-Based Representations of Planning 

For many discrete planning problems that we would hope a computer can solve, the
state space is enormous (e.g., 10^100 states). Therefore, substantial effort has been
invested in constructing implicit encodings of problems in hopes that the entire 
state space does not have to be explored by the algorithm to solve the problem. 
This will be a recurring theme throughout the planning algorithms covered in this 
book; therefore, it is important to pay close attention to representations. Many 
planning problems can appear trivial once everything has been explicitly given. 

Logic-based representations have been popular for constructing such implicit
representations of discrete planning. One historical reason is that such
representations were the basis of the majority of artificial intelligence research
during the 1950s-1980s. Another reason is that they have been useful for
representing certain kinds of planning problems very compactly. It may be helpful to
think of these representations as compression schemes. A string such as
"010101010101..." may compress very nicely, while it is impossible to substantially
compress a random string of bits. Similar principles are true for discrete planning.
Some problems contain a kind of regularity that enables them to be expressed
compactly, while for others it may be impossible to find such representations. This
is why there has been a variety of representation logics proposed through decades
of planning research.

Another reason for using logic-based representations is that many discrete 
planning algorithms are implemented in large software systems. At some point, 
when these systems solve a problem, they must provide the complete plan to a 
user, who may or may not care about the internals of planning. Logic-based
representations have seemed convenient for producing output that logically explains
the steps involved to arrive at some goal. Other possibilities may exist, but logic
has been a first choice due to its historical popularity.

In spite of these advantages, one shortcoming with logic-based representations 
is that they are difficult to generalize to enable concepts such as modeling uncer- 
tainty, unpredictability, sensing errors, and game theory to be incorporated into 
planning. This is the main reason why the state space representation has been 
used so far: it will be easy to extend and adapt to the problems covered through- 
out this book. Nevertheless, it is important to study logic-based representations 
to understand the relationship between the vast majority of discrete planning re- 
search and other problems considered in this book, such as motion planning, or 
planning with differential constraints. There are many recurring themes through- 
out these different kinds of problems, even though historically they have been 
investigated by separate research communities. Understanding these connections 
well will give you a powerful understanding of planning issues across all of these 
areas. 

2.5.1 A STRIPS-Like Representation 

STRIPS-like representations have been the most common logic-based representa- 
tion for discrete planning problems. This refers to the STRIPS system, which is 
considered one of the first planning algorithms and representations [247]; its name 
means STanford Research Institute Problem Solver. The original representation 
used first-order logic, which had great expressive power but many technical diffi- 
culties. Therefore, the representation was later restricted to use only propositional 
logic [583], which is similar to the form introduced in this section. There are many 
variations of STRIPS-like representations, one of which is presented here. 
The following model is given, followed by a detailed explanation. 

Formulation 2.5.1 (STRIPS-Like Planning) 

1. A nonempty set, I, of instances.

2. A nonempty set, P, of predicates, which are binary-valued (partial) functions
of one or more instances. Each application of a predicate to a specific set
of instances is called a positive literal if the predicate is true or a negative
literal if it is false.

3. A nonempty set, O, of operators, each of which has: 1) preconditions, which 
is a set of positive and negative literals that must hold for the operator to 
apply, and 2) effects, which is a set of positive and negative literals that are 
the result of applying the operator. 

4. An initial set, S, which is expressed as a set of positive literals. All literals 
not appearing in S are assumed to be negative. 

5. A goal set, G, which is expressed as a set of both positive and negative 
literals. 

Formulation 2.5.1 provides a definition of discrete feasible planning expressed
in a STRIPS-like representation. The three most important components are the
sets of instances, I, predicates, P, and operators, O. Informally, the instances
characterize the complete set of distinct things that exist in the world. They
could for example be books, cars, trees, etc. The predicates correspond to basic
properties or statements that can be formed regarding the instances. For example,
a predicate called Under might be used to indicate things like Under(Book, Table)
(the book is under the table) or Under(Dirt, Rug). When a predicate is shown
with instances, such as Under(Dirt, Rug), then it is called a literal, which must
either have the value true or false. If it is true, it is called a positive literal;
otherwise, it is called a negative literal. A predicate can be interpreted as a kind
of function that yields true or false values; however, it is important to note
that it is only a partial function because it might not be desirable to allow any
instance to be inserted as an argument to the predicate.

The role of an operator is to change the world. For an operator to be applicable,
its preconditions must all be satisfied. Each element of this set is a literal along
with a required true or false value for the operator to be applicable. Any
literals that can be formed from the predicates, but are not mentioned in the
preconditions, may assume any value for applicability of the operator. If the
operator is applied, then the world is updated in a manner precisely specified by
the set of effects. This set of literals indicates positive and negative literals that
will result from the application of the operator. All other literals that could be
constructed will retain their values if they do not appear in the effects.
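The applicability test and the update rule can be sketched as set operations. The world is represented here by the set of literals that are currently true (all others are false); the Operator class, the string encoding of literals, and the Sweep operator are hypothetical.

```python
from typing import FrozenSet, NamedTuple


class Operator(NamedTuple):
    name: str
    pre_pos: FrozenSet[str]  # literals that must be true
    pre_neg: FrozenSet[str]  # literals that must be false
    eff_pos: FrozenSet[str]  # literals made true by the operator
    eff_neg: FrozenSet[str]  # literals made false by the operator


def applicable(state, op):
    """state is the set of literals that are currently true."""
    return op.pre_pos <= state and not (op.pre_neg & state)


def apply_operator(state, op):
    """Literals not mentioned in the effects retain their values."""
    if not applicable(state, op):
        raise ValueError(f"{op.name} is not applicable")
    return (state - op.eff_neg) | op.eff_pos


# Hypothetical operator: remove the dirt from under the rug.
sweep = Operator("Sweep", frozenset({"Under(Dirt,Rug)"}), frozenset(),
                 frozenset(), frozenset({"Under(Dirt,Rug)"}))

state = frozenset({"Under(Dirt,Rug)", "Under(Book,Table)"})
new_state = apply_operator(state, sweep)
print(sorted(new_state))
```

Under(Book,Table) is mentioned in neither the preconditions nor the effects, so it does not affect applicability and keeps its value.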

The planning problem is expressed in terms of an initial set, S, of positive
literals, and a goal set, G, of positive and negative literals. The task is to find a
sequence of operators that when applied in succession will transform the world 
from the initial state into one in which all literals of G are satisfied. For each 
operator, the preconditions must also be satisfied before it can be applied. 

The following example illustrates Formulation 2.5.1.

Figure 2.14: An example that involves putting batteries into a flashlight.

Example 2.5.1 Imagine a planning problem that involves putting two batteries
into a flashlight, as shown in Figure 2.14. The set of instances is

I = {Battery1, Battery2, Cap, Flashlight}.    (2.19)

Two different predicates will be defined, On and In, each of which is a partial
function on I. The predicate On may only be applied to evaluate whether the
Cap is On the Flashlight, and is written as On(Cap, Flashlight). The predicate
In may be applied in the following two ways: In(Battery1, Flashlight),
In(Battery2, Flashlight), to indicate whether or not either battery is in the
flashlight. Recall that predicates are only partial functions in general. For
predicate In it is not desirable to apply any instance to any argument. For example,
In(Battery1, Battery1) and In(Flashlight, Battery2) are senseless to maintain
(they could be included in the model, always retaining a negative value, but it is
inefficient).

The initial set is

S = {On(Cap, Flashlight), ¬In(Battery1, Flashlight), ¬In(Battery2, Flashlight)},    (2.20)

which means that the first literal is positive, and the remaining two are negative,
as indicated by the preceding ¬ symbol (the cap is on the flashlight, but the
batteries are outside). The goal state is

G = {On(Cap, Flashlight), In(Battery1, Flashlight), In(Battery2, Flashlight)},    (2.21)

which means that both batteries must be in the flashlight, and the cap is on the
flashlight.

Name         Preconditions                                  Effects
PlaceCap     {¬On(Cap, Flashlight)}                         {On(Cap, Flashlight)}
RemoveCap    {On(Cap, Flashlight)}                          {¬On(Cap, Flashlight)}
Insert(i)    {¬On(Cap, Flashlight), ¬In(i, Flashlight)}     {In(i, Flashlight)}
Remove(i)    {¬On(Cap, Flashlight), In(i, Flashlight)}      {¬In(i, Flashlight)}

Table 2.1: Four operators for the flashlight problem. Note that an operator can
be expressed with variable argument(s) for which different instances could be
substituted.



The set O consists of the four operators, which are shown in Table 2.1. Here
is a plan that reaches the goal state in the smallest number of steps:

(RemoveCap, Insert(Battery1), Insert(Battery2), PlaceCap)    (2.22)

In plain English, it simply says to take the cap off, put the batteries in, and place
the cap back on.

This example appears quite simple, and one would expect a planning
algorithm to easily find such a solution. It can be made more challenging by adding
many more instances to I, such as more batteries, more flashlights, and a bunch of
objects that are irrelevant to achieving the goal. Also, many other predicates and
operators can be added so that the number of different combinations of operators
becomes overwhelming. ■
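As a sketch of how a planner might find such a plan, the following breadth-first search works directly on sets of true literals. The string encoding of literals, the tuple layout of operators, and the function names are assumptions; the operators themselves follow the flashlight example.

```python
from collections import deque

ON = "On(Cap,Flashlight)"
BATTERIES = ["Battery1", "Battery2"]


def in_flashlight(i):
    return f"In({i},Flashlight)"


# Each operator: (name, pre_pos, pre_neg, eff_pos, eff_neg).
OPERATORS = [
    ("PlaceCap", set(), {ON}, {ON}, set()),
    ("RemoveCap", {ON}, set(), set(), {ON}),
]
for i in BATTERIES:  # substitute each battery for the variable argument i
    lit = in_flashlight(i)
    OPERATORS.append((f"Insert({i})", set(), {ON, lit}, {lit}, set()))
    OPERATORS.append((f"Remove({i})", {lit}, {ON}, set(), {lit}))


def successors(state):
    """Yield (operator name, next state) for every applicable operator."""
    for name, pre_pos, pre_neg, eff_pos, eff_neg in OPERATORS:
        if pre_pos <= state and not (pre_neg & state):
            yield name, frozenset((state - eff_neg) | eff_pos)


def bfs_plan(initial, goal_pos, goal_neg=frozenset()):
    """Breadth-first search; returns a plan with the fewest operators."""
    start = frozenset(initial)
    queue, visited = deque([(start, [])]), {start}
    while queue:
        state, plan = queue.popleft()
        if goal_pos <= state and not (goal_neg & state):
            return plan
        for name, nxt in successors(state):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, plan + [name]))
    return None


plan = bfs_plan({ON}, {ON, in_flashlight("Battery1"), in_flashlight("Battery2")})
print(plan)
```

With the operator ordering used above, the search returns the four-step plan of (2.22). Only the states actually reached are ever constructed, which hints at why implicit representations matter as the number of instances grows.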



2.5.2 Converting to the State Space Representation 

It is useful to characterize the relationship between Formulation 2.5.1 and the
original formulation of discrete feasible planning, Formulation 2.2.1. One benefit
is that it will immediately indicate how the search methods of Section 2.3 can be
adapted to work for logic-based representations. It is also helpful to understand
the relationships between the algorithmic complexities of the two representations.

Up to now, the notion of "state" has only been vaguely mentioned in the
context of the STRIPS-like representation. Now consider making this more
concrete. Suppose that every predicate has k arguments, and in each argument any
instance could appear. This means that there are |P| |I|^k different literals at any
given time, which corresponds to all ways to substitute instances into all
arguments of all predicates. Each literal may be either true or false. The complete
set of literals may be encoded as a binary string by imposing a linear ordering on
the instances and predicates. The state of the world is then specified in order.
Using Example 2.5.1, this might appear like:

(On(Cap1, Flashlight1), ¬On(Cap2, Flashlight1), ..., In(Battery7, Flashlight5), ...).    (2.23)

Using the binary string, each element can be "0" to denote false, or "1" to
denote true. The resulting state would be x = 10···1···, for the example
above. The length of the string is thus |P| |I|^k. The total number of states of
the world that could possibly be distinguished corresponds to the set of
all possible bit strings, which is of size

2^{|P| |I|^k}.    (2.24)

The implication is that with a very small number of instances and predicates, an
enormous state space can be generated. Even though the search algorithms of
Section 2.3 may appear efficient with respect to the size of the search graph (or
the number of states), the algorithms appear horribly inefficient with respect to
the sizes of P and I. This has motivated substantial efforts on the development of
heuristics to help guide the search more efficiently by exploiting the structure of
specific representations.
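A short calculation makes the growth concrete. The function names and the sample sizes are illustrative assumptions; the formulas are the literal count |P| |I|^k and the state count from (2.24).

```python
def num_literals(num_predicates, num_instances, k):
    """Number of literals when every predicate takes k arguments."""
    return num_predicates * num_instances ** k


def num_states(num_predicates, num_instances, k):
    """Number of distinguishable bit strings, as in (2.24)."""
    return 2 ** num_literals(num_predicates, num_instances, k)


# Two predicates over four instances with k = 2 (sizes similar to the
# flashlight problem if no argument restrictions were imposed).
print(num_literals(2, 4, 2), num_states(2, 4, 2))

# Ten predicates over ten instances with k = 2: count the decimal digits.
print(len(str(num_states(10, 10, 2))))
```

Even the second case, which is tiny by modeling standards, yields a state count with hundreds of decimal digits, so enumerating the state space explicitly is out of the question.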

The next step in converting to a state space representation is to encode the
initial state x_I as a string. The goal set, X_G, is the set of all strings that are
consistent with the positive and negative goal literals. This can be compressed
by extending the string alphabet to include a "don't care" symbol, δ. A single
string that has a "0" for each negative literal, a "1" for each positive literal, and
a "δ" for all others would suffice in representing any X_G that is expressed with
positive and negative literals.
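The goal encoding can be sketched as a pattern match on strings. The literal ordering, the use of "*" as a stand-in for the "don't care" symbol, and the sample goal are assumptions made for illustration.

```python
# Hypothetical linear ordering of the literals in the flashlight problem.
LITERALS = [
    "On(Cap,Flashlight)",
    "In(Battery1,Flashlight)",
    "In(Battery2,Flashlight)",
]
DONT_CARE = "*"  # stands in for the "don't care" symbol


def encode_state(true_literals):
    """Bit string with "1" for each true literal and "0" otherwise."""
    return "".join("1" if lit in true_literals else "0" for lit in LITERALS)


def encode_goal(positive, negative):
    """One pattern string that represents the whole goal set X_G."""
    symbols = []
    for lit in LITERALS:
        if lit in positive:
            symbols.append("1")
        elif lit in negative:
            symbols.append("0")
        else:
            symbols.append(DONT_CARE)
    return "".join(symbols)


def in_goal(x, pattern):
    """True if state string x is consistent with the goal pattern."""
    return all(p == DONT_CARE or p == b for b, p in zip(x, pattern))


x_I = encode_state({"On(Cap,Flashlight)"})
# Hypothetical goal: battery 1 inserted, cap off, battery 2 unconstrained.
pattern = encode_goal({"In(Battery1,Flashlight)"}, {"On(Cap,Flashlight)"})
print(x_I, pattern, in_goal("011", pattern), in_goal(x_I, pattern))
```

A pattern with m "don't care" symbols stands for 2^m states, so the goal set never has to be listed explicitly.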

The next step is to convert the operators. For each state, x ∈ X, the set
U(x) will represent the set of operators with preconditions that are satisfied by
x. To apply the search techniques of Section 2.3, note that it is not necessary to
determine U(x) explicitly in advance for all x ∈ X. Instead, it can be computed
whenever each x is encountered for the first time in the search. The effect of the
operator is encoded by the state transition equation. From a given x ∈ X, the
next state, f(x, u), is obtained by flipping the bits as prescribed by the effects
part of the operator.
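A minimal sketch of U(x) and f(x, u) on bit strings follows. The bit ordering and the dictionary layout are assumptions; the operators are those of the flashlight example with both batteries substituted for the variable argument.

```python
# Bit positions follow a hypothetical ordering of the literals:
# 0: On(Cap,Flashlight), 1: In(Battery1,Flashlight), 2: In(Battery2,Flashlight)
# Each operator maps to (preconditions, effects), both as {position: bit}.
OPERATORS = {
    "PlaceCap": ({0: "0"}, {0: "1"}),
    "RemoveCap": ({0: "1"}, {0: "0"}),
    "Insert(Battery1)": ({0: "0", 1: "0"}, {1: "1"}),
    "Remove(Battery1)": ({0: "0", 1: "1"}, {1: "0"}),
    "Insert(Battery2)": ({0: "0", 2: "0"}, {2: "1"}),
    "Remove(Battery2)": ({0: "0", 2: "1"}, {2: "0"}),
}


def U(x):
    """Operators whose preconditions are satisfied by x, computed on demand."""
    return [name for name, (pre, _) in OPERATORS.items()
            if all(x[i] == b for i, b in pre.items())]


def f(x, u):
    """State transition: overwrite the bits named in the effects of u."""
    _, effects = OPERATORS[u]
    bits = list(x)
    for i, b in effects.items():
        bits[i] = b
    return "".join(bits)


x = "100"  # cap on, both batteries out
print(U(x), f(x, "RemoveCap"), U(f(x, "RemoveCap")))
```

Because U(x) is evaluated only when a state is first encountered, the search never needs a table of actions over all 2^3 strings, let alone over the much larger spaces of realistic problems.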

All of the components of Formulation 2.2.1 have been derived from the com- 
ponents of Formulation 2.5.1. Adapting the search techniques of Section 2.3 is 
straightforward. It is also straightforward to extend Formulation 2.5.1 to repre- 
sent optimal planning. A cost can be associated with each operator and set of 
literals that capture the current state. This will express l(x,u) of the cost func- 
tional, L, from Section 2.4. Thus, it is also possible to adapt the DP iterations to 
work under the logic-based representation, yielding optimal plans. 

2.5.3 Logic-Based Planning 

Need to give a brief survey of heuristic planning methods that work directly with
the logic-based representation.

Literature

(This will get filled in a little more later. Here are some references for now.) 

• Introduction of DP [63, 64] 

• Graph search algorithms [176] 

• Logic representations [247, 583] 

• AI search [409, 622, 634] 

• Discrete-time optimal control [19, 70, 67] 

• Recent survey on AI planning (which they rename to automated planning),
which expands considerably on the subject of Section 2.5. This is an excellent
source of material that is also about planning, but is complementary to this
book in many ways [274]

• More coverage of labeling algorithms [67] 

Exercises 

(Exercises in italics are not yet fully specified) 

1. A simple example to simulate the algorithms. Verify that forward DP itera- 
tions and Dijkstra get the same result. 

2. Try implementing and experimenting with some search variants. 

3. With A* search, the performance degrades substantially when there are
many alternative solutions that are all optimal, or at least close to optimal.
Implement A* search and evaluate it on various labyrinth problems,
based on Example 2.2.1. Compare the performance for two different cases:

(a) Using |i' − i| + |j' − j| as the heuristic, as suggested in Section 2.3.2.

(b) Using |i' − i|^2 + |j' − j|^2 as the heuristic.

Which heuristic seems superior? Explain your answer.

4. Design some kind of multiresolution expanding search algorithm for the in- 
finite tile floor. 

5. Play with randomization on the grid problem. 

6. Try to construct a worst-case example for best-first search that has
properties similar to that shown in Figure 2.6, but instead involves moving in a
2D world with obstacles, as introduced in Example 2.2.1.

7. It turns out that the general DP iterations can be generalized to a loss
functional of the form

L(\pi_K) = \sum_{k=1}^{K} l(x_k, u_k, x_{k+1}) + l_F(x_F),    (2.25)

in which l(x_k, u_k) is replaced by l(x_k, u_k, x_{k+1}).

(a) Show that the dynamic programming principle can be applied in this
more general setting to obtain forward and backwards DP iterations
that solve the fixed-length optimal planning problem.

(b) Do the same, but for the more general problem of variable-length plans,
which uses termination conditions.

8. The cost functional can be generalized to become stage-dependent, which
means that the cost might depend on the particular stage, k, in addition to
the state, x_k, and the action, u_k. Extend the DP algorithms of Section 2.4.1
to work for this case, and show that they give optimal solutions. Each term
of the more general cost functional should be denoted as l(x_k, u_k, k).

9. Recall from Section 2.4.2 the method of defining a termination action, u_T,
to make the DP iterations work correctly for variable-length planning.
Instead of requiring that one remains at the same state, it is also possible to
formulate the problem by creating a special state, called the terminal state,
x_T. Whenever u_T is applied, the state becomes x_T. Describe in detail how
to modify the cost functional, state transition equation, and any other
necessary components so that the DP iterations will correctly compute shortest
plans.

10. Dijkstra's algorithm was presented as a kind of forward search in Section
2.3.1.

(a) Derive a backwards version of Dijkstra's algorithm that starts from the
goal. Show that it always yields optimal plans.

(b) Describe the relationship between the algorithm from part (a) and the
backwards DP iterations from Section 2.4.2.

(c) Derive a backwards version of the A* algorithm and show that it yields
optimal plans.

11. Reformulate the general forward search algorithm of Section 2.3.1 so that
it is expressed in terms of the STRIPS-like representation. Carefully
consider what needs to be explicitly constructed by the algorithm and what is
considered only implicitly.

12. Experiment with the original STRIPS heuristic.



Part II
Motion Planning

Overview of Part II: Motion Planning



Planning in Continuous Spaces 

Part II makes the transition from discrete to continuous state spaces. Two alter- 
native titles may be considered for this part: 1) motion planning, and 2) planning 
in continuous state spaces. Chapters 3-8 are based on research from the field of 
motion planning, which has been building since the 1970s; therefore, the name 
motion planning is widely known to refer to the collection of models and algo- 
rithms that will be covered. On the other hand, it is convenient to also think of 
Part II as planning in continuous spaces because this is the primary distinction 
with respect to most other forms of planning. 

In addition, motion planning will frequently refer to motions of a robot in a 2D
or 3D world that contains obstacles. The robot could model an actual robot, or
any other collection of moving bodies, such as humans or flexible molecules.
A motion plan involves determining what motions are appropriate for the robot so 
that it reaches a goal state without colliding with obstacles. An earlier name for 
motion planning is the Piano Movers' Problem, which brings to mind the image of 
trying to move a grand piano through narrow passages in a house. Have you ever 
been involved in an argument about how to move a sofa up some stairs? Motion 
planning tries to resolve such debates. 

Many issues that arose in Chapter 2 will appear once again in motion planning. 
Two themes that may help to see the connection are: 



Implicit representations 

A familiar theme from Chapter 2 is that planning algorithms must deal with im- 
plicit representations of the state space. In motion planning, this will become even 
more important because the state space is uncountably infinite. Furthermore, a 
complicated transformation exists between the world in which the models are de- 
fined and the space in which the planning occurs. Chapter 3 covers ways to model 
motion planning problems, which includes defining 2D and 3D geometric models 
and transforming them. Chapter 4 introduces the state space that arises for these 
problems. Following motion planning literature [504, 437], we will refer to this 
state space as the configuration space. The dimension of the configuration space 
corresponds to the number of degrees of freedom of the geometric model. Using 
the configuration space, motion planning will be viewed as a kind of search in an 
implicitly-represented, high-dimensional state space. One additional complication 
is that configuration spaces have unusual topological structure that must be cor- 
rectly characterized to ensure correct operation of planning algorithms. A motion 
plan will then be defined as a continuous path in the configuration space.

Continuous → discrete

A central theme throughout motion planning is to transform the continuous model 
into a discrete one. Because of this transformation, many algorithms from Chap- 
ter 2 are embedded in motion planning algorithms. There are two alternatives 
to achieving this, which are covered in Chapters 6 and 5, respectively. Chapter 
6 covers combinatorial motion planning, which means that from the input model 
the algorithms build a discrete representation that exactly represents the origi- 
nal problem. This leads to complete planning approaches, which are guaranteed 
to find a solution when it exists, or correctly report failure if one does not ex- 
ist. Chapter 5 covers sampling-based motion planning, which refers to algorithms 
that use collision detection methods to sample the configuration space and con- 
duct discrete searches that utilize these samples. In this case, completeness is 
sacrificed, but is often replaced with a weaker notion, such as resolution com- 
pleteness or probabilistic completeness. It is important to study both Chapters 6 
and 5 because each methodology has its strengths and weaknesses. Combinatorial 
methods can solve virtually any motion planning problem, and in some restricted 
cases, very elegant solutions may be efficiently constructed in practice. However, 
for the majority of "industrial grade" motion planning problems, the running 
time and implementation difficulty of these algorithms make them prohibitive. 
Sampling-based algorithms have fulfilled much of this need in recent years by 
solving challenging problems in several settings, such as automobile assembly, hu- 
manoid robot planning, and conformational analysis in drug design. Although the 
completeness guarantees are weaker, the efficiency and ease of implementation of 
these methods have bolstered interest in applying motion planning algorithms to a
wide variety of applications. 

Two additional chapters appear in Part II. Chapter 7 covers several exten- 
sions of the basic motion planning problem from the earlier chapters. These 
extensions include avoiding moving obstacles, multiple robot coordination, ma- 
nipulation planning, and planning with closed kinematic chains. Algorithms that 
solve these problems build on the principles of earlier chapters, but each extension 
involves new challenges. 

Chapter 8 is a transitional chapter that involves many elements of motion plan- 
ning, but is additionally concerned with gracefully recovering from unexpected 
deviations during execution. Although uncertainty in predicting the future is 
not explicitly modeled until Part III, Chapter 8 redefines the notion of a plan 
to be a function over state space, as opposed to being a path through it. The 
function gives the appropriate actions to take during execution, regardless of what
configuration is entered. This allows the true configuration to drift away from 
the commanded configuration. In later chapters, such uncertainties will be explic- 
itly modeled, but this comes at greater modeling and computational costs. It is 
worthwhile to develop effective ways to avoid this. 



Chapter 3 



Geometric Representations and 
Transformations 



Chapter Status 



What does this mean? Check 

http://msl.cs.uiuc.edu/planning/status.html

for information on the latest version. 



This chapter provides important background material that will be needed for 
Part II. Formulating and solving motion planning problems requires defining and 
manipulating complicated geometric models of a system of bodies in space. Sec- 
tion 3.1 introduces geometric modeling, which focuses mainly on semi-algebraic 
modeling because it is an important part of Chapter 6. If your interest is only 
in Chapter 5, then understanding semi-algebraic models is not critical. Sections
3.2 and 3.3 describe how to transform a single body and a chain of bodies, re- 
spectively. This will enable the robot to "move". These sections are essential for 
understanding all of Part II, and many sections beyond. It is expected that many 
readers will already have some or all of this background (especially Section 3.2),
but it is included for completeness. Section 3.4 extends the framework for trans- 
forming chains of bodies to transforming trees of bodies, which allows modeling 
of complicated systems, such as humanoid robots and flexible organic molecules. 
Finally, Section 3.5 briefly covers transformations that do not assume the bodies 
are rigid. 

3.1 Geometric Modeling 

A wide variety of approaches and techniques for geometric modeling exist, and
the particular choice usually depends on the application and the difficulty of the
problem. In most cases, there are two alternatives: 1) a boundary representation,
and 2) a solid representation. Suppose we would like to define a model
of a planet. Using a boundary representation, we might write the equation of a 
sphere that roughly coincides with the planet's surface. Using a solid represen- 
tation, we would describe the set of all points that are contained in the sphere. 
Both alternatives will be considered in this section. 

The first task is to define the world, W, for which there are two possible
choices: 1) a 2D world, in which $W = \mathbb{R}^2$, and 2) a 3D world, in which
$W = \mathbb{R}^3$.
These choices should be sufficient for most problems; however, one might also 
want to allow more complicated worlds, such as the surface of a sphere or even a 
higher-dimensional space. Such generalities are avoided in this book because their 
current applications are limited. 

Unless otherwise stated, the world generally contains two kinds of entities: 

1. Obstacles: Portions of the world that are "permanently" occupied, for ex- 
ample, as in the walls of a building. 

2. Robots: Geometric bodies that are controllable via a motion plan. 

Based on the terminology, one obvious application is to model a robot that moves
around in a building; however, many other possibilities exist. For example, the
robot could be a flexible molecule and the obstacles could be a folded protein. As
another example, the robot could be a virtual human in a graphical simulation
that involves obstacles (imagine the family of Doom-like adventure games).

This section presents a method of systematically constructing representations 
of obstacles and robots using a collection of primitives. Both obstacles and robots 
will be considered as (closed) subsets of W. Let the obstacle region, O, denote the
set of all points in W that lie in one or more obstacles; hence, $O \subseteq W$. The next
step is to define a systematic way of representing O that will have great expressive 
power and be computationally efficient. Robots will be defined in a similar way; 
however, this will be deferred until Section 3.2, where transformations of geometric 
bodies are defined. 

3.1.1 Polygonal and Polyhedral Models 

In Sections 3.1.1 and 3.1.2, a solid representation of O will be developed in terms 
of a combination of primitives. Each primitive, $H_i$, represents a subset of W
that is easy to represent and manipulate. A complicated obstacle region will 
be represented by taking finite, Boolean combinations of primitives. Using set 
theory, this implies that O can also be defined in terms of a finite number of 
unions, intersections, and set differences of primitives. 

Convex polygons First consider O for the case in which the obstacle region
is a convex, polygonal subset of a 2D world, $W = \mathbb{R}^2$. A subset, $X \subset W$, is
called convex if and only if for any pair of points in X, all points along the line
segment that connects them are contained in X. More precisely, this means that
for any $x_1, x_2 \in X$, all points that can be expressed in the form
$\lambda x_1 + (1 - \lambda) x_2$ (linear interpolation), for some scalar $\lambda \in (0, 1)$,
must also lie in X. Intuitively, X contains no pockets or indentations. A set that
is not convex is called nonconvex (as opposed to concave, which seems better
suited for lenses).

A boundary representation of O is an m-sided polygon, which can be described 
using two kinds of features: vertices and edges. Every vertex corresponds to a 
"corner" of the polygon, and every edge corresponds to a line segment between a 
pair of vertices. The polygon can be specified by a sequence,
$(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)$, of m points in $\mathbb{R}^2$, given in
counterclockwise order.

A solid representation of O can be expressed as the intersection of m half- 
planes. Each half-plane corresponds to the set of all points that lie to one side 
of a line that is common to a polygon edge. Figure 3.1 shows an example of an 
octagon that is represented as the intersection of eight half planes. 

An edge of the polygon is specified by two points, such as $(x_1, y_1)$ and
$(x_2, y_2)$. Consider the equation of a line that passes through $(x_1, y_1)$ and
$(x_2, y_2)$. An equation can be determined of the form $ax + by + c = 0$, in which
$a, b, c \in \mathbb{R}$ are constants that are determined from $x_1$, $y_1$, $x_2$, and $y_2$.
Let $f : \mathbb{R}^2 \to \mathbb{R}$ be the function given by $f(x, y) = ax + by + c$. Note that
$f(x, y) < 0$ on one side of the line, and $f(x, y) > 0$ on the other. (In fact, if
$a^2 + b^2 = 1$, then f may be interpreted as a signed Euclidean distance from
$(x, y)$ to the line; otherwise it is that distance scaled by $\sqrt{a^2 + b^2}$.) The
sign of $f(x, y)$ indicates a half plane that is bounded by the line, as depicted in
Figure 3.2. Without loss of generality, assume that $f(x, y)$ is defined such that
$f(x, y) < 0$ for all points to the left of the edge from $(x_1, y_1)$ to $(x_2, y_2)$ (if
it is not, then multiply $f(x, y)$ by $-1$).

Let $f_i(x, y)$ denote the f function derived from the line that corresponds to
the edge from $(x_i, y_i)$ to $(x_{i+1}, y_{i+1})$ for $1 \le i < m$. Let $f_m(x, y)$ denote
the line equation that corresponds to the edge from $(x_m, y_m)$ to $(x_1, y_1)$. Let a
half plane, $H_i$, for $1 \le i \le m$ be defined as a subset of W:

$$H_i = \{(x, y) \in W \mid f_i(x, y) \le 0\}. \tag{3.1}$$

Above, $H_i$ is a primitive that describes the set of all points on one side of the
line $f_i(x, y) = 0$ (including the points on the line).

A convex, m-sided, polygonal obstacle region, O, is expressed as

$$O = H_1 \cap H_2 \cap \cdots \cap H_m. \tag{3.2}$$
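
As an illustrative sketch (not part of the original text), the half planes of
(3.1) can be computed directly from a counterclockwise vertex list, after which
(3.2) becomes a conjunction of sign tests. The names `half_planes` and
`in_convex_polygon` and the sample square are assumptions made for this example.
The coefficients are chosen so that f is negative to the left of each directed
edge, matching the sign convention assumed above.

```python
from typing import List, Tuple

Point = Tuple[float, float]
HalfPlane = Tuple[float, float, float]  # (a, b, c) in f(x, y) = a*x + b*y + c


def half_planes(vertices: List[Point]) -> List[HalfPlane]:
    """Return one (a, b, c) triple per edge of a convex polygon given in CCW order."""
    planes = []
    m = len(vertices)
    for i in range(m):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % m]  # the last edge wraps back to the first vertex
        a = y2 - y1
        b = x1 - x2
        c = -(a * x1 + b * y1)  # makes f vanish at both endpoints of the edge
        planes.append((a, b, c))
    return planes


def in_convex_polygon(planes: List[HalfPlane], x: float, y: float) -> bool:
    """True if (x, y) lies in every half plane, i.e. f_i(x, y) <= 0 for all i."""
    return all(a * x + b * y + c <= 0 for a, b, c in planes)


# Hypothetical test data: a square with side length 2, listed counterclockwise.
square = half_planes([(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)])
print(in_convex_polygon(square, 1.0, 1.0))  # interior point
print(in_convex_polygon(square, 3.0, 1.0))  # point outside the square
```

Because the primitives use a non-strict inequality, points on the boundary of
the square are reported as belonging to the region.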

Nonconvex polygons The assumption that O is convex is too limited for most
applications. Now suppose that O is a nonconvex, polygonal subset of W. In this
case, O can be expressed as

$$O = O_1 \cup O_2 \cup \cdots \cup O_n, \tag{3.3}$$

in which each $O_i$ is a convex, polygonal set that is expressed in terms of half
planes using (3.2). Note that $O_i$ and $O_j$ for $i \neq j$ need not be disjoint. Using
this representation, very complicated obstacle regions in W can be defined. Although
these regions may contain multiple components and holes, if O is bounded (i.e., O
will fit inside of a big enough rectangular box) its boundary will consist of linear
segments.

Figure 3.1: A convex polygonal region can be identified by the intersection of
half-planes.

Figure 3.2: The sign of $f(x, y)$ partitions $\mathbb{R}^2$ into three regions: two half planes
given by $f(x, y) < 0$ and $f(x, y) > 0$, and the line $f(x, y) = 0$.

In general, more complicated representations of O can be defined in terms of
any finite combination of unions, intersections, and set differences of primitives;
however, it is always possible to simplify the representation into the form given by
(3.2) and (3.3). A set difference can be avoided by redefining the primitive.
Suppose the model requires removing a set defined by a primitive $H_i$ that is given
by $f_i(x, y) < 0$ (see footnote 1). This is equivalent to keeping all points such that
$f_i(x, y) \ge 0$, which is equivalent to $-f_i(x, y) \le 0$. This can be used to define a
new primitive $H'_i$, which when taken in intersection with other sets, is equivalent
to the removal of $H_i$. Given a complicated combination of primitives, once set
differences are removed, the expression can be simplified into a finite union of
finite intersections by applying Boolean algebra laws.

Note that the representation of a nonconvex polygon is not unique. There 
are many ways to decompose O into convex components. The decomposition 
should be carefully selected to optimize computational performance in whichever
algorithms the model will be used. In most cases, the components may even be
allowed to overlap. Ideally, it seems that it would be nice to represent O with the
minimum number of primitives, but automating such a decomposition may lead to
an NP-hard problem. See the literature remarks at the end of this chapter. One
efficient, practical way to decompose O is to apply the vertical cell decomposition
algorithm, which will be presented in Section 6.2.2.

Defining a logical predicate What is the value of the previous representation?
As a simple example, we can define a logical predicate that serves as a
collision detector. Recall from Section 2.5.1 that a predicate is a Boolean-valued
function. Let $\phi$ be a predicate defined as $\phi : W \to \{\text{TRUE}, \text{FALSE}\}$, which
returns TRUE for a point in W that lies in O, and FALSE otherwise. For a line
given by $f(x, y) = 0$, let $e(x, y)$ denote a logical predicate that returns TRUE if
$f(x, y) \le 0$, and FALSE otherwise.

A predicate that corresponds to a convex polygonal region can be represented
by a logical conjunction,

$$\alpha(x, y) = e_1(x, y) \wedge e_2(x, y) \wedge \cdots \wedge e_m(x, y). \tag{3.4}$$

The predicate $\alpha(x, y)$ returns TRUE if the point $(x, y)$ lies in the convex
polygonal region, and FALSE otherwise. An obstacle region that consists of n convex
polygons can be represented by a logical disjunction of conjuncts:

$$\phi(x, y) = \alpha_1(x, y) \vee \alpha_2(x, y) \vee \cdots \vee \alpha_n(x, y). \tag{3.5}$$

Although more efficient methods exist, the predicate $\phi(x, y)$ can be used to check
whether a point $(x, y)$ lies inside of O in time $O(n)$, in which n is the number of
primitives that appear in the representation of O (each primitive is evaluated in
constant time).

Footnote 1: In this section, we want the resulting set to include all of the points
along the boundary. Therefore, $<$ is used to model a set for removal, as opposed
to $\le$.

Note the convenient connection between a logical predicate representation and 
a set-theoretic representation. Using the logical predicate, the unions and intersec- 
tions of the set-theoretic representation are replaced by logical OR's and AND's. 
It is well known from Boolean algebra that any complicated logical sentence can 
be reduced to a logical disjunction of conjunctions (this is often called "sum of 
products" in computer engineering). This is equivalent to our previous statement 
that O can always be represented as a union of intersections of primitives. 
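
The correspondence between (3.5) and a union of intersections maps naturally
onto nested `all` and `any` calls. The following minimal sketch assumes that each
convex piece is stored as a list of (a, b, c) coefficient triples; the helper
`box` and the L-shaped obstacle are hypothetical and serve only as test data.

```python
from typing import List, Tuple

HalfPlane = Tuple[float, float, float]  # (a, b, c) for f(x, y) = a*x + b*y + c
ConvexPiece = List[HalfPlane]           # one conjunction, alpha_j


def e(plane: HalfPlane, x: float, y: float) -> bool:
    """Edge predicate: TRUE if f(x, y) <= 0."""
    a, b, c = plane
    return a * x + b * y + c <= 0


def alpha(piece: ConvexPiece, x: float, y: float) -> bool:
    """Conjunction over the half planes of one convex piece, as in (3.4)."""
    return all(e(plane, x, y) for plane in piece)


def phi(obstacle: List[ConvexPiece], x: float, y: float) -> bool:
    """Disjunction over the convex pieces, as in (3.5)."""
    return any(alpha(piece, x, y) for piece in obstacle)


def box(x0: float, y0: float, x1: float, y1: float) -> ConvexPiece:
    """Hypothetical helper: an axis-aligned rectangle as four half planes."""
    return [(-1.0, 0.0, x0), (1.0, 0.0, -x1), (0.0, -1.0, y0), (0.0, 1.0, -y1)]


# An L-shaped (nonconvex) obstacle as the union of two overlapping rectangles.
l_shape = [box(0.0, 0.0, 3.0, 1.0), box(0.0, 0.0, 1.0, 3.0)]
print(phi(l_shape, 2.0, 0.5))  # inside the horizontal bar of the L
print(phi(l_shape, 2.0, 2.0))  # in the notch of the L, outside the obstacle
```

The two rectangles overlap in the unit square at the corner, which illustrates
that the convex pieces need not be disjoint.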

Polyhedral models For a 3D world, $W = \mathbb{R}^3$, and the previous concepts can
be nicely generalized from the 2D case by replacing polygons with polyhedra, and
replacing half-plane primitives with half-space primitives. A boundary
representation can be defined in terms of three features: vertices, edges, and faces.
Every face is a "flat" polygon embedded in $\mathbb{R}^3$. Every edge forms a boundary
between two faces. Every vertex forms a boundary between three or more edges.

Several data structures have been proposed that allow one to conveniently
"walk" around the polyhedral features. For example, the doubly connected edge
list [189] data structure contains three types of records: faces, half-edges, and
vertices. Each vertex record holds the point coordinates, and a pointer to an
arbitrary half-edge that touches the vertex. Each face record contains a pointer
to an arbitrary half-edge on its boundary. Each face is bounded by a circular
list of half-edges. There is a pair of directed half-edge records for each edge of
the polyhedron. Each half-edge is shown as an arrow in Figure 3.3.b. Each
half-edge record contains pointers to five other records: 1) the vertex from which the
half-edge originates, 2) the "twin" half-edge, which bounds the neighboring face,
and has the opposite direction, 3) the face that is bounded by the half-edge, 4)
the next element in the circular list of edges that bound the face, 5) the previous
element in the circular list of edges that bound the face. Once all of these records
have been defined, one can conveniently traverse the structure of the polyhedron.
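
The three record types can be sketched as small classes. This is only one
possible layout, assuming Python dataclasses; the field names and the helper
`make_face` are hypothetical, and only a single face is built, so the `twin`
pointers are left unset.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(eq=False)
class Vertex:
    point: Tuple[float, float, float]
    half_edge: Optional["HalfEdge"] = None  # an arbitrary half-edge leaving the vertex


@dataclass(eq=False)
class Face:
    half_edge: Optional["HalfEdge"] = None  # an arbitrary half-edge on the boundary


@dataclass(eq=False)
class HalfEdge:
    origin: Vertex                          # 1) vertex where the half-edge starts
    twin: Optional["HalfEdge"] = None       # 2) oppositely directed half-edge
    face: Optional[Face] = None             # 3) face bounded by this half-edge
    next: Optional["HalfEdge"] = None       # 4) next half-edge around the face
    prev: Optional["HalfEdge"] = None       # 5) previous half-edge around the face


def make_face(points: List[Tuple[float, float, float]]) -> Face:
    """Hypothetical helper: build one face whose boundary visits `points` in order."""
    verts = [Vertex(p) for p in points]
    face = Face()
    edges = [HalfEdge(origin=v, face=face) for v in verts]
    n = len(edges)
    for i, h in enumerate(edges):
        h.next = edges[(i + 1) % n]
        h.prev = edges[(i - 1) % n]
        verts[i].half_edge = h
    face.half_edge = edges[0]
    return face


def face_vertices(face: Face) -> List[Vertex]:
    """Walk the circular list of half-edges that bounds a face."""
    result = []
    h = face.half_edge
    while True:
        result.append(h.origin)
        h = h.next
        if h is face.half_edge:
            break
    return result


tri = make_face([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)])
print([v.point for v in face_vertices(tri)])
```

In a complete polyhedron, each `twin` pointer would link to the half-edge of the
neighboring face, which is what makes it possible to step from one face to the
next during a traversal.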

Next consider a solid representation of a polyhedron. Suppose that O is a 
convex polyhedron, as shown in Figure 3.3. A solid representation can be con- 
structed from the vertices. Each face of O has at least three vertices along its 
boundary. Assuming these vertices are not collinear, an equation of the plane 
that passes through them can be determined of the form ax + by + cz + d = 0, in 
which $a, b, c, d \in \mathbb{R}$ are constants.

Once again, the function f can be constructed, except this time
$f : \mathbb{R}^3 \to \mathbb{R}$, and $f(x, y, z) = ax + by + cz + d$. Let a half space, $H_i$, for
$1 \le i \le m$, for all m faces of O, be defined as a subset of W:

$$H_i = \{(x, y, z) \in W \mid f_i(x, y, z) \le 0\}. \tag{3.6}$$

Figure 3.3: a) A polyhedron can be described in terms of faces, edges, and vertices.
b) The edges of each face can be stored in a circular list that is traversed in
counterclockwise order with respect to the outward normal vector of the face.

It is important to choose $f_i$ so that it takes on negative values inside of the
polyhedron. In the case of a polygonal model, it was possible to consistently
define $f_i$ by proceeding in counterclockwise order around the boundary. In the
case of a polyhedron, the half-edge data structure can be used to obtain for each
face the list of edges that form its boundary in counterclockwise order. Figure
3.3.b shows the edge ordering for each face. Note that the boundary of each face
can be traversed in counterclockwise order. For every edge, the arrows point in
opposite directions, as required by the half-edge data structure. The equation
for each face can be consistently determined as follows. Choose three consecutive
vertices, $p_1$, $p_2$, $p_3$ (they must not be collinear) in counterclockwise order on the
boundary of the face. Let $v_{12}$ denote the vector from $p_1$ to $p_2$, and let $v_{23}$ denote
the vector from $p_2$ to $p_3$. The cross product $v = v_{12} \times v_{23}$ will always yield a
vector that points out of the polyhedron and is normal to the face. Recall that
the vector $[a \; b \; c]$ is parallel to the normal to the plane. If these are chosen as
$a = v[1]$, $b = v[2]$, and $c = v[3]$, and d is chosen so that $f(p_1) = 0$, then
$f(x, y, z) \le 0$ for all points in the half space that contains the polyhedron.

As in the case of a polygonal model, a convex polyhedron can be defined as
the intersection of a finite number of half spaces, one for each face. A nonconvex
polyhedron can be defined as the union of a finite number of convex polyhedra.
The predicate $\phi(x, y, z)$ can be defined in a similar manner, in this case yielding
TRUE if $(x, y, z) \in O$ and FALSE otherwise.
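
The face-orientation rule described above (three consecutive counterclockwise
vertices and a cross product) can be sketched in a few lines. The function names
are assumptions for this example, and the constant d is obtained by requiring
that the plane pass through the first vertex, a step the text leaves implicit.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]
Plane = Tuple[float, float, float, float]  # (a, b, c, d) in f = a*x + b*y + c*z + d


def sub(p: Vec3, q: Vec3) -> Vec3:
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def cross(u: Vec3, v: Vec3) -> Vec3:
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def face_plane(p1: Vec3, p2: Vec3, p3: Vec3) -> Plane:
    """Plane through three non-collinear vertices listed counterclockwise
    when viewed from outside the polyhedron; the normal (a, b, c) points outward."""
    a, b, c = cross(sub(p2, p1), sub(p3, p2))
    d = -(a * p1[0] + b * p1[1] + c * p1[2])  # forces f(p1) = 0
    return (a, b, c, d)


def f(plane: Plane, point: Vec3) -> float:
    a, b, c, d = plane
    return a * point[0] + b * point[1] + c * point[2] + d


# Hypothetical test data: the top face (z = 1) of the unit cube.
top = face_plane((0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (1.0, 1.0, 1.0))
print(top)
print(f(top, (0.5, 0.5, 0.5)))  # cube center: negative, so inside the half space
print(f(top, (0.5, 0.5, 2.0)))  # point above the cube: positive
```

Repeating this for every face and testing that all of the resulting values are
nonpositive gives the polyhedral analog of the convex polygon test.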
Figure 3.4: a) Once again, f is used to partition $\mathbb{R}^2$ into two regions. In this case,
the algebraic primitive represents a disc-shaped region. b) The shaded "face" can
be exactly modeled using only four algebraic primitives.

3.1.2 Semi-Algebraic Models 

In both the polygonal and polyhedral models, f was a linear function. In the
case of a semi-algebraic model for a 2D world, f can be any polynomial with
real-valued coefficients and variables x and y. For a 3D world, f is a polynomial
with variables x, y, and z. The class of semi-algebraic models includes both
polygonal and polyhedral models, which use first-degree polynomials. A point set
determined by a single polynomial primitive is called an algebraic set; a point set
that can be obtained by a finite number of unions and intersections of algebraic
sets is called a semi-algebraic set.

Consider the case of a 2D world. A solid representation can be defined using
algebraic primitives of the form

$$H = \{(x, y) \in W \mid f(x, y) \le 0\}. \tag{3.7}$$

As an example, let $f = x^2 + y^2 - 4$. In this case, H represents a disc of radius
2 that is centered at the origin. This corresponds to the set of points, $(x, y)$, for
which $f(x, y) \le 0$, as depicted in Figure 3.4.a.

Example 3.1.1 (Gingerbread face) Consider constructing a model of the shaded
region shown in Figure 3.4.b. Let the outer circle have radius $r_1$ and
be centered at the origin. Suppose that the "eyes" have radius $r_2$ and $r_3$, and are
centered at $(x_2, y_2)$ and $(x_3, y_3)$, respectively. Let the "mouth" be an ellipse with
major axis a and minor axis b, centered at $(0, y_4)$. The functions are defined as

$$f_1 = x^2 + y^2 - r_1^2,$$
$$f_2 = -[(x - x_2)^2 + (y - y_2)^2 - r_2^2],$$
$$f_3 = -[(x - x_3)^2 + (y - y_3)^2 - r_3^2],$$
$$f_4 = -[x^2/a^2 + (y - y_4)^2/b^2 - 1].$$

For $f_2$, $f_3$, and $f_4$, the familiar circle and ellipse equations were multiplied
by $-1$ to yield algebraic primitives for all points outside of the circle or ellipse.
The shaded region, O, can be represented as

$$O = H_1 \cap H_2 \cap H_3 \cap H_4. \tag{3.8}$$
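
The gingerbread face can be turned into a membership test by evaluating the
four polynomials and checking their signs. In this sketch, the function name and
all numerical values (radii, eye centers, mouth size and position) are
hypothetical; the example in the text leaves them symbolic.

```python
def in_gingerbread_face(x, y, r1=4.0,
                        eyes=((-1.5, 1.5, 0.5), (1.5, 1.5, 0.5)),
                        mouth=(2.0, 0.5, -1.5)):
    """Return True if (x, y) lies in O = H1 ∩ H2 ∩ H3 ∩ H4.

    eyes  : two (center_x, center_y, radius) triples (assumed values)
    mouth : (a, b, y4), an ellipse centered at (0, y4) (assumed values)
    """
    a, b, y4 = mouth
    values = [x * x + y * y - r1 * r1]              # f1: inside the outer circle
    for xc, yc, r in eyes:                          # f2, f3: outside each eye
        values.append(-((x - xc) ** 2 + (y - yc) ** 2 - r * r))
    values.append(-(x * x / (a * a) + (y - y4) ** 2 / (b * b) - 1.0))  # f4: outside mouth
    return all(v <= 0 for v in values)


print(in_gingerbread_face(0.0, 0.0))    # between the eyes and the mouth
print(in_gingerbread_face(-1.5, 1.5))   # center of the left eye
print(in_gingerbread_face(0.0, -1.5))   # center of the mouth
print(in_gingerbread_face(5.0, 0.0))    # beyond the outer circle
```

Only the first query lies in the shaded region; the other three fail one of the
four sign tests.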



In the case of semi-algebraic models, the intersection of primitives does not
necessarily result in a convex subset of W. In general, however, it might be
necessary to form O by taking unions and intersections of algebraic primitives.

For semi-algebraic models, a logical predicate, $\phi(x, y)$, can once again be
formed, and collision checking is still performed in time that is linear in the
number of primitives because it does not depend on the particular primitives. Note
that it is still very efficient to evaluate every primitive: f is just a polynomial that
is evaluated on the point $(x, y, z)$.

The ideas generalize easily for the case of a 3D world, obtaining algebraic
primitives of the form

$$H = \{(x, y, z) \in W \mid f(x, y, z) \le 0\}, \tag{3.9}$$

which can be used to define a solid representation of a 3D obstacle, O, and also may
be used to construct the predicate $\phi(x, y, z)$.

Equations 3.7 and 3.9 are sufficient to express any model of interest. One may
define many other primitives based on different relations, such as $f(x, y) \ge 0$,
$f(x, y) = 0$, $f(x, y) < 0$, and $f(x, y) \neq 0$; however, most of them
do not enhance the set of models that can be expressed. They might, however,
be more convenient in certain contexts. To see that some primitives do not allow
new models to be expressed, consider the following primitive

$$H = \{(x, y, z) \in W \mid f(x, y, z) \ge 0\}. \tag{3.10}$$

The right part may be alternatively represented as $-f(x, y, z) \le 0$, and $-f$ may
be considered as a new polynomial function of x, y, and z. For an example that
involves the = relation, consider the primitive

$$H = \{(x, y, z) \in W \mid f(x, y, z) = 0\}. \tag{3.11}$$

It can instead be constructed as $H = H_1 \cap H_2$, in which

$$H_1 = \{(x, y, z) \in W \mid f(x, y, z) \le 0\} \tag{3.12}$$

and

$$H_2 = \{(x, y, z) \in W \mid -f(x, y, z) \le 0\}. \tag{3.13}$$

The relation $<$ does add some expressive power if it is used to construct
primitives (see footnote 2). It is needed to construct models that do not include the
outer boundary (for example, the set of all points inside of a sphere, which does
not include points on the sphere). These are generally called open sets, and are
defined in Chapter 4.

Footnote 2: An alternative that yields the same expressive power is to still use
$\le$, but allow set complements, in addition to unions and intersections.




Figure 3.5: A polygon with holes can be expressed by using different orientations: 
counterclockwise for the outer boundary and clockwise for the hole boundaries. 
Note that the shaded part is always to the left when following the arrows. 

3.1.3 Other Models 

The choice of a model often depends on the types of operations that will be 
performed by the planning algorithm. For combinatorial planning methods, to be 
covered in Chapter 6, the particular representation is critical. On the other hand, 
for sampling-based planning methods, to be covered in Chapter 5, the particular 
representation is the problem of the collision detection algorithm, which is treated 
as a "black box" as far as planning is concerned. Therefore, the models given in the 
remainder of this section are more likely to appear in sampling-based approaches, 
and may be invisible to the designer of a planning algorithm (although it is never 
wise to forget about the representation). 

Nonconvex Polygons and Polyhedra 

The method in Section 3.1.1 required nonconvex polygons to be represented as
a union of convex polygons. Instead, a boundary representation of a nonconvex
polygon may be directly encoded by listing vertices in a specific order; assume
counterclockwise. Each polygon of m vertices may be encoded by a list of the
form $(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)$. It is assumed that there is an edge
between each $(x_i, y_i)$ and $(x_{i+1}, y_{i+1})$, and also between $(x_m, y_m)$ and
$(x_1, y_1)$. Ordinarily, the vertices should be chosen in a way that makes the
polygon simple, meaning that no edges intersect (other than consecutive edges at
their shared vertex). In this case, there is a well-defined interior of the polygon,
which is to the left of every edge, if the vertices are listed in counterclockwise order.

What if a polygon has a hole in it? In this case, the boundary of the hole
can be expressed as a polygon, but with its vertices expressed in the clockwise
direction. To the left of each edge will be the interior of the outer polygon, and
to the right is the hole, as shown in Figure 3.5.
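
One simple way to check the orientation convention is the signed area (the
"shoelace" formula), which is positive for a counterclockwise vertex list and
negative for a clockwise one. This technique is not taken from the text; it is
offered as a sketch, and the polygon coordinates below are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def signed_area(vertices: List[Point]) -> float:
    """Shoelace formula: positive if the vertices are listed counterclockwise."""
    total = 0.0
    m = len(vertices)
    for i in range(m):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % m]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return 0.5 * total


outer = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]  # counterclockwise
hole = [(1.0, 1.0), (1.0, 3.0), (3.0, 3.0), (3.0, 1.0)]   # clockwise

print(signed_area(outer))                      # positive: outer boundary
print(signed_area(hole))                       # negative: hole boundary
print(signed_area(outer) + signed_area(hole))  # area of the shaded region
```

With opposite orientations for outer boundaries and holes, summing the signed
areas of all boundary loops yields the area of the region that is kept.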

Although the data structures are a little more complicated for three dimensions,
boundary representations of nonconvex polyhedra may be expressed in a
similar manner. In this case, instead of an edge list, one must specify faces, edges,
and vertices, with pointers that indicate their incidence relations. Consistent
orientations must also be chosen, and holes may be modeled once again by selecting
opposite orientations.

Figure 3.6: Triangle strips and triangle fans can reduce the number of redundant
points.

3D triangles 

Suppose $W = \mathbb{R}^3$. One of the most convenient models to express is a set of
triangles, each of which is specified by three points, $(x_1, y_1, z_1)$,
$(x_2, y_2, z_2)$, $(x_3, y_3, z_3)$. This model has been popular in computer graphics
because graphics acceleration in hardware has mainly been developed in terms of
triangle primitives. It is assumed that the interior of the triangle is part of the
model. Thus, two triangles are considered as "colliding" if one pokes into the
interior of another. This model offers great flexibility because there are no
constraints on the way in which triangles must be expressed; however, this is also
one of the drawbacks. There is no coherency that can be exploited to easily declare
whether a point is "inside" or "outside" of a 3D obstacle. If there is at least some
coherency, then it is sometimes preferable to reduce redundancy in the specification
of triangle coordinates (many triangles will share the same corners).
Representations that remove this redundancy are triangle strips, each of which is a
sequence of triangles such that each adjacent pair shares a common edge, and
triangle fans, each of which is a triangle strip in which all triangles share a
common vertex. See Figure 3.6.
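
The saving offered by strips and fans is easiest to see by expanding them back
into individual triangles. The sketch below assumes the common graphics
convention in which every point after the second adds one triangle; swapping
the first two vertices of every other strip triangle to keep a consistent
orientation is also an assumption, not something stated in the text.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Triangle = Tuple[Point, Point, Point]


def strip_triangles(points: Sequence[Point]) -> List[Triangle]:
    """Expand a triangle strip: triangle i uses points i, i+1, i+2."""
    triangles = []
    for i in range(len(points) - 2):
        a, b, c = points[i], points[i + 1], points[i + 2]
        # Swap on odd i so that all triangles keep the same orientation.
        triangles.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return triangles


def fan_triangles(points: Sequence[Point]) -> List[Triangle]:
    """Expand a triangle fan: every triangle shares points[0]."""
    return [(points[0], points[i], points[i + 1])
            for i in range(1, len(points) - 1)]


# Hypothetical test data: five points in the plane z = 0.
pts = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0),
       (1.0, 1.0, 0.0), (2.0, 0.0, 0.0)]
print(len(strip_triangles(pts)), "triangles from", len(pts), "strip points")
print(len(fan_triangles(pts)), "triangles from", len(pts), "fan points")
```

Three independent triangles would require nine points, whereas the strip or fan
encodes them with five.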

Nonuniform Rational B-Splines (NURBS) 

These are used in many engineering design systems to allow convenient design 
and adjustment of curved surfaces, in applications such as aircraft or automobile 
body design. In contrast to semi-algebraic models, which are implicit equations, 
NURBS and other splines are parametric equations. This makes computations 
such as rendering easier; however, others, such as collision-detection, become more 
difficult. These models may be defined in any dimension. A brief two-dimensional 
formulation is given here. 

A curve can be expressed as

$$C(u) = \frac{\sum_{i=0}^{n} w_i P_i N_{i,k}(u)}{\sum_{i=0}^{n} w_i N_{i,k}(u)}, \tag{3.14}$$

in which $w_i \in \mathbb{R}$ are weights and $P_i$ are control points. The $N_{i,k}$ are
normalized basis functions of degree k, which can be expressed recursively as

$$N_{i,k}(u) = \frac{u - t_i}{t_{i+k} - t_i} N_{i,k-1}(u) + \frac{t_{i+k+1} - u}{t_{i+k+1} - t_{i+1}} N_{i+1,k-1}(u). \tag{3.15}$$

The basis of the recursion is $N_{i,0}(u) = 1$ if $t_i \le u < t_{i+1}$, and
$N_{i,0}(u) = 0$ otherwise. A knot vector is a nondecreasing sequence of real values,
$\{t_0, t_1, \ldots, t_m\}$, that controls the intervals over which certain basis
functions take effect.
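
A direct, unoptimized evaluation of a NURBS curve point follows from the
recursion. This sketch assumes the standard rational form (a weighted sum of
control points divided by the sum of the weighted basis values) and the usual
convention that a term with a zero denominator, which arises for repeated knots,
contributes zero. The function names and the sample data are hypothetical.

```python
from typing import List, Sequence, Tuple

Point2 = Tuple[float, float]


def basis(i: int, k: int, u: float, t: Sequence[float]) -> float:
    """Normalized basis function N_{i,k}(u) for knot vector t (recursive form)."""
    if k == 0:
        return 1.0 if t[i] <= u < t[i + 1] else 0.0
    left = right = 0.0
    d1 = t[i + k] - t[i]
    if d1 > 0:  # assumed convention: skip terms whose denominator is zero
        left = (u - t[i]) / d1 * basis(i, k - 1, u, t)
    d2 = t[i + k + 1] - t[i + 1]
    if d2 > 0:
        right = (t[i + k + 1] - u) / d2 * basis(i + 1, k - 1, u, t)
    return left + right


def nurbs_point(u: float, k: int, ctrl: List[Point2],
                w: List[float], t: Sequence[float]) -> Point2:
    """Evaluate the rational curve at u; u must lie in a half-open knot span
    where the basis functions are defined (the final knot itself is excluded)."""
    num_x = num_y = den = 0.0
    for i, (p, wi) in enumerate(zip(ctrl, w)):
        b = wi * basis(i, k, u, t)
        num_x += b * p[0]
        num_y += b * p[1]
        den += b
    return (num_x / den, num_y / den)


# Hypothetical test data: a quadratic curve with three control points.
ctrl = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)]
knots = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
print(nurbs_point(0.5, 2, ctrl, [1.0, 1.0, 1.0], knots))  # unit weights
print(nurbs_point(0.5, 2, ctrl, [1.0, 2.0, 1.0], knots))  # heavier middle weight
```

Increasing the middle weight pulls the curve toward the middle control point,
which is the kind of adjustment that makes NURBS convenient in design systems.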

Bitmaps 

For either $W = \mathbb{R}^2$ or $W = \mathbb{R}^3$, it is possible to discretize a bounded
portion of the world into rectangular cells that may or may not be occupied. The
resulting model will look very similar to Example 2.2.1. The resolution of this
discretization determines the number of cells per axis and the quality of the
approximation. The representation may be considered as a binary image in which
each "1" in the image corresponds to a rectangular region that contains at least
some part of O, and "0" represents those that do not contain any of O. Although
bitmaps do not have the elegance of the other models, they often arise in
applications. One example is a digital map constructed by a mobile robot that
explores an environment with its sensors. One generalization of bitmaps is a
grey-scale map or occupancy grid. In this case, a numerical value may be assigned
to each cell, indicating quantities such as "the probability that an obstacle
exists" or the "expected difficulty of traversing the cell". The latter case is
often used in terrain maps for navigating planetary rovers.
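
As a small illustration, a bitmap for a disc-shaped obstacle can be built by
marking every cell that contains at least some part of the disc. The sketch
below tests each cell exactly by finding the point of the cell closest to the
disc center; the function name, grid layout (row j, column i), and sample
values are assumptions.

```python
from typing import List


def disc_bitmap(cx: float, cy: float, r: float, xmin: float, ymin: float,
                cell: float, nx: int, ny: int) -> List[List[int]]:
    """Return grid[j][i] = 1 if cell (i, j) intersects the closed disc."""
    grid = []
    for j in range(ny):
        row = []
        for i in range(nx):
            x0, y0 = xmin + i * cell, ymin + j * cell
            x1, y1 = x0 + cell, y0 + cell
            # Closest point of the cell [x0, x1] x [y0, y1] to the disc center.
            px = min(max(cx, x0), x1)
            py = min(max(cy, y0), y1)
            occupied = (px - cx) ** 2 + (py - cy) ** 2 <= r * r
            row.append(1 if occupied else 0)
        grid.append(row)
    return grid


# Hypothetical test data: unit disc centered at (2, 2) on a 4 x 4 grid of unit cells.
for row in disc_bitmap(2.0, 2.0, 1.0, 0.0, 0.0, 1.0, 4, 4):
    print(row)
```

Cells that merely touch the disc boundary are marked as occupied because the
obstacle is a closed set; a finer resolution would tighten the approximation.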

Superquadrics 

Instead of using polynomials to define f, many generalizations can be constructed.
One popular type of model is a superquadric, which generalizes quadric surfaces.
One example is a superellipsoid, given for $W = \mathbb{R}^3$ by

$$\left( \left| \frac{x}{a} \right|^{n_1} + \left| \frac{y}{b} \right|^{n_2} \right)^{n_1/n_2} + \left| \frac{z}{c} \right|^{n_1} - 1 \le 0, \tag{3.16}$$

in which $n_1 \ge 2$ and $n_2 \ge 2$. If $n_1 = n_2 = 2$, an ellipsoid is generated. As
$n_1$ and $n_2$ increase, the superellipsoid becomes shaped like a box with rounded
corners.
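
The effect of the exponents can be checked numerically. This sketch assumes the
superellipsoid inequality has the form
(|x/a|^n1 + |y/b|^n2)^(n1/n2) + |z/c|^n1 - 1 <= 0; the function name and the
sample point are hypothetical.

```python
def in_superellipsoid(x: float, y: float, z: float,
                      a: float, b: float, c: float,
                      n1: float, n2: float) -> bool:
    """Membership test for the assumed superellipsoid inequality."""
    value = ((abs(x / a) ** n1 + abs(y / b) ** n2) ** (n1 / n2)
             + abs(z / c) ** n1 - 1.0)
    return value <= 0.0


point = (0.8, 0.8, 0.8)  # near a corner of the cube [-1, 1]^3
print(in_superellipsoid(*point, 1.0, 1.0, 1.0, 2, 2))    # exponents 2: a unit sphere
print(in_superellipsoid(*point, 1.0, 1.0, 1.0, 10, 10))  # exponents 10: nearly a box
```

The same point lies outside the sphere obtained with exponents of 2 but inside
the shape obtained with exponents of 10, consistent with the shape filling out
toward a box as the exponents grow.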

Generalized cylinders 

A generalized cylinder is a generalization of an ordinary cylinder. Instead of being
limited to a line, the center axis is a continuous spine curve, $(x(s), y(s), z(s))$ for
some parameter $s \in [0, 1]$. Instead of a constant radius, a radius function $r(s)$ is
defined along the spine. The value $r(s)$ is the radius of the circle obtained as the
cross section of the generalized cylinder at the point $(x(s), y(s), z(s))$. The normal
to the cross section plane is the tangent to the spine curve at s.

3.2 Rigid Body Transformations

Any of the techniques from Section 3.1 can be used to define both the obstacle
region and the robot. Let O refer to the obstacle region, which is a subset of W.
Let A refer to the robot, which is a subset of $\mathbb{R}^2$ or $\mathbb{R}^3$, matching the dimension
of W. Although O remains fixed in the world, W, motion planning problems will
require "moving" the robot, A.

3.2.1 General Concepts 

Before giving specific transformations, it will be helpful to define them in general
to avoid confusion in later parts when intuitive notions might fall apart. Suppose
that the robot, A, is defined as a subset of $\mathbb{R}^2$ or $\mathbb{R}^3$. A rigid body transformation is
a function, $h : A \rightarrow W$, that maps every point of A into W with two requirements:
1) the distance between any pair of points of A must be preserved, and 2) the
orientation of A must be preserved (no "mirror images").

Using standard function notation, $h(a)$ for some $a \in A$ refers to the point in
W that is "occupied" by $a$. Let

$$ h(A) = \{ h(a) \in W \mid a \in A \}, \tag{3.17} $$

which is the image of $h$, indicating all points in W occupied by the transformed
robot.

Consider transforming a robot model. If A is expressed by naming specific
points in $\mathbb{R}^2$, as in a boundary representation of a polygon, then each point is
simply transformed from $a$ to $h(a) \in W$, and the entire model is easily transformed.
However, be careful when the model is expressed with primitives, such as

$$ H_i = \{ a \in \mathbb{R}^2 \mid f_i(a) \leq 0 \}, \tag{3.18} $$

which differs slightly from (3.1) because the robot is not directly defined in W,
and also $a$ is used to denote a point $(x, y) \in A$. Under a transformation $h$, the
half plane in W may be represented as

$$ h(H_i) = \{ h(a) \in W \mid f_i(a) \leq 0 \}. \tag{3.19} $$

To transform the primitive completely, however, it is better to directly name points
$w \in W$, as opposed to $h(a) \in W$. This becomes

$$ h(H_i) = \{ w \in W \mid f_i(h^{-1}(w)) \leq 0 \}, \tag{3.20} $$

in which the inverse of $h$ appears on the right side because the original point $a \in A$
needs to be recovered to evaluate $f_i$.

Thus, sometimes the forward transformation is needed, and at other times the
inverse is needed. Be careful! Specific examples will be given shortly that clearly
illustrate this.

The coming sections will introduce families of transformations, in which some
parameters are used to select the particular transformation. Therefore, it makes
sense to generalize $h$ to accept two variables: a new parameter $q$, along with
$a \in A$. The resulting transformed point is denoted by $h(q, a)$, and the entire
robot is transformed to $h(q, A) \subseteq W$.

The coming material will use the following shorthand notation, which requires
the specific $h$ to be inferred from the context. Let $h(q, a)$ be shortened to $a(q)$, and
let $h(q, A)$ be shortened to $A(q)$. This notation makes it appear that by adjusting
the parameter $q$, the robot A travels around in W as different transformations are
selected from the family. This is slightly abusive notation, but it is convenient.
The expression $A(q)$ can be considered as a set-valued function that yields the
set of points in W that are occupied by A when it is transformed by $q$. Most of
the time the notation does not cause trouble, but when it does, it is helpful to
remember the definitions from this section, especially when trying to determine
whether forward or inverse versions of the transformations need to be used.

One final comment before starting: note that A, before it is transformed, is
also a subset of W. It was written only as a subset of $\mathbb{R}^2$ or $\mathbb{R}^3$ to avoid confusion
in the discussion above. Another way to make the distinction clear is to borrow
from mechanics [], and give the robot a separate coordinate frame from the world.
Thus, the robot is defined in an object frame, and the world is defined in a reference
frame. A transformation indicates where the object frame appears with respect to
the reference frame. When multiple bodies are covered in Section 3.3, each body
will have its own object frame, and all bodies will be expressed with respect to
the reference frame.

3.2.2 2D Transformations 

Translation The robot A will be translated by using two parameters, $x_t, y_t \in \mathbb{R}$.
From Section 3.2.1, this means that $q = (x_t, y_t)$. The function $h$ is defined as
$h(x, y) = (x + x_t, y + y_t)$. A boundary representation of A can be translated by
transforming each vertex in the sequence of polygon vertices. Each point $(x_i, y_i)$
in the sequence is simply replaced by $(x_i + x_t, y_i + y_t)$.

Now consider a solid representation of A, defined in terms of primitives. Each
primitive of the form

$$ H_i = \{ (x, y) \in \mathbb{R}^2 \mid f(x, y) \leq 0 \} \tag{3.21} $$

is transformed to

$$ h(H_i) = \{ (x, y) \in W \mid f(x - x_t, y - y_t) \leq 0 \}. \tag{3.22} $$

For example, suppose the robot is a disc of unit radius, centered at the origin. It
is modeled by a single primitive,

$$ A = \{ (x, y) \in \mathbb{R}^2 \mid x^2 + y^2 - 1 \leq 0 \}. \tag{3.23} $$

Figure 3.7: For every transformation there are two interpretations: (a) translation
of the robot; (b) translation of the coordinate frame.

Suppose A is translated $x_t$ units in the x direction, and $y_t$ units in the y direction.
The transformed primitive is

$$ h(A) = \{ (x, y) \in W \mid (x - x_t)^2 + (y - y_t)^2 - 1 \leq 0 \}, \tag{3.24} $$

which is the familiar equation for a disc centered at $(x_t, y_t)$. In this example, the
inverse, $h^{-1}$, was used, as described in Section 3.2.1.

The translated robot is denoted as $A(x_t, y_t)$. Translation by (0, 0) is the identity
transformation, which results in $A(0, 0) = A$, if it is assumed that $A \subseteq W$
(recall that A does not necessarily have to be initially embedded in W). It will be
convenient to use the term degrees of freedom to refer to the maximum number of
independent parameters that can be selected to completely characterize the robot
in the world. If the set of allowable values for $x_t$ and $y_t$ forms a two-dimensional
subset of $\mathbb{R}^2$, then the number of degrees of freedom is two.

As shown in Figure 3.7, there are two interpretations of the transformation
of A: 1) the coordinate system remains fixed, and A is translated; 2) A
remains fixed and the coordinate system is translated in the opposite direction.
The first one indicates how the transformation appears while standing at the
origin, and the second one indicates how the transformation appears from the
robot's perspective. Unless stated otherwise, the first interpretation will be used
when we refer to motion planning problems because it often models a robot moving
in a physical world. Note that numerous books cover coordinate transformations
under the second interpretation. This has been known to cause confusion since
the transformations may sometimes appear "backwards" from what is desired.

Rotation The robot, A, can be rotated counterclockwise by some angle
$\theta \in [0, 2\pi)$ by mapping every $(x, y) \in A$ to
$(x\cos\theta - y\sin\theta,\; x\sin\theta + y\cos\theta)$. Using a 2 x 2 rotation matrix,

$$ R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \tag{3.25} $$

the transformation can be written as

$$ \begin{pmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \end{pmatrix} = R(\theta) \begin{pmatrix} x \\ y \end{pmatrix}. \tag{3.26} $$

Using the notation of Section 3.2.1, $R(\theta)$ would be $h(q)$, for which $q = \theta$. For linear
transformations, such as the one defined above, recall that the column vectors
represent the basis vectors of the new coordinate frame. The column vectors of
$R(\theta)$ are unit vectors, and their inner product (or dot product) is zero, indicating
they are orthogonal. Suppose that the X and Y coordinate axes are "painted"
on A. The columns of $R(\theta)$ can be derived by considering the resulting directions
of the X and Y axes, respectively, after performing a counterclockwise rotation
by the angle $\theta$. This interpretation generalizes nicely for rotation matrices of any
dimension.

Note that the rotation is performed about the origin. Thus, when defining the
model of A, the origin should be placed at the intended axis of rotation. Using
the semi-algebraic model, the entire robot model can be rotated by transforming
each primitive, yielding $A(\theta)$. The inverse rotation, $R(-\theta)$, must be applied to
each primitive.
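The difference between the forward and inverse uses of a rotation can be made concrete with a small Python sketch. The helper names and the half-plane primitive below are hypothetical and serve only to illustrate the idea: vertices of a boundary representation are mapped forward, whereas a primitive is evaluated at the inverse image of each world point.

```python
import math


def rotate(theta, p):
    """Forward rotation: apply the 2D rotation by theta to the point p = (x, y)."""
    x, y = p
    c, s = math.cos(theta), math.sin(theta)
    return (x * c - y * s, x * s + y * c)


def rotate_vertices(theta, vertices):
    """Boundary representation: map every vertex with the forward rotation."""
    return [rotate(theta, v) for v in vertices]


def rotate_primitive(theta, f):
    """Solid representation: return g with g(w) = f(R(-theta) w).

    A world point w belongs to the rotated primitive exactly when its
    inverse image under the rotation satisfies the original inequality.
    """
    def g(x, y):
        ax, ay = rotate(-theta, (x, y))
        return f(ax, ay)
    return g


# Hypothetical primitive in the robot's frame: the half plane x - 1 <= 0.
def half_plane(x, y):
    return x - 1.0


# After a quarter turn counterclockwise the half plane becomes y - 1 <= 0.
rotated = rotate_primitive(math.pi / 2, half_plane)
print(rotated(5.0, 0.5) <= 0)   # True: y = 0.5 satisfies y <= 1
print(rotated(0.0, 2.0) <= 0)   # False: y = 2 violates y <= 1
```

Applying the forward rotation inside `rotate_primitive` instead of the inverse would turn the half plane the wrong way, which is the mistake the text warns about.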

Suppose a rotation by $\theta$ is performed, followed by a translation by $x_t, y_t$. This
can be used to place the robot in any desired position and orientation in W.
Note that these two transformations do not commute! If the operations are applied
successively, each $(x, y) \in A$ is transformed to

$$ \begin{pmatrix} x\cos\theta - y\sin\theta + x_t \\ x\sin\theta + y\cos\theta + y_t \end{pmatrix}. \tag{3.27} $$

Notice that the following matrix multiplication will yield the same result for the
first two vector components:

$$ \begin{pmatrix} \cos\theta & -\sin\theta & x_t \\ \sin\theta & \cos\theta & y_t \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = \begin{pmatrix} x\cos\theta - y\sin\theta + x_t \\ x\sin\theta + y\cos\theta + y_t \\ 1 \end{pmatrix}. \tag{3.28} $$

This implies that the 3 x 3 matrix,

$$ T = \begin{pmatrix} \cos\theta & -\sin\theta & x_t \\ \sin\theta & \cos\theta & y_t \\ 0 & 0 & 1 \end{pmatrix}, \tag{3.30} $$

may be used to represent a rotation followed by a translation.

The matrix T will be referred to as a homogeneous transformation. It is important
to remember that T represents a rotation followed by a translation (not the other
way around). Each primitive can be transformed using the inverse of T, resulting
in a transformed solid model of the robot. The transformed robot is denoted by
$A(x_t, y_t, \theta)$, and in this case there are three degrees of freedom. The homogeneous
transformation matrix is a convenient representation of the combined transformations;
therefore, it is frequently used in robotics, mechanics, computer graphics,
and elsewhere. It is called homogeneous because over $\mathbb{R}^3$ it is just a linear
transformation without any translation. The trick of increasing the dimension by one
to absorb the translational part is borrowed from projective geometry, where it
plays an important role.

Figure 3.8: Any rotation in 3D can be described as a sequence of yaw, pitch, and
roll rotations.
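The following Python sketch, under the assumption that the homogeneous matrix has rows $(\cos\theta, -\sin\theta, x_t)$, $(\sin\theta, \cos\theta, y_t)$, and $(0, 0, 1)$, illustrates two points from the discussion: the order of rotation and translation matters, and a primitive is transformed by evaluating it at the inverse image of a world point. All function names are hypothetical.

```python
import math


def homogeneous_2d(xt, yt, theta):
    """Rotation by theta followed by translation by (xt, yt), as a 3x3 matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, xt], [s, c, yt], [0.0, 0.0, 1.0]]


def apply(T, p):
    """Apply T to the point p = (x, y) using the homogeneous vector (x, y, 1)."""
    v = (p[0], p[1], 1.0)
    return tuple(sum(T[r][k] * v[k] for k in range(3)) for r in range(2))


def inverse_2d(T):
    """Inverse of a rigid 2D homogeneous matrix: [R^T, -R^T t; 0, 1]."""
    c, s = T[0][0], T[1][0]
    xt, yt = T[0][2], T[1][2]
    return [[c, s, -(c * xt + s * yt)],
            [-s, c, -(-s * xt + c * yt)],
            [0.0, 0.0, 1.0]]


T = homogeneous_2d(2.0, 0.0, math.pi / 2)

# Rotate (1, 0) by a quarter turn, then translate by (2, 0): approximately (2, 1).
rotate_then_translate = apply(T, (1.0, 0.0))

# Translate first, then rotate: approximately (0, 3), a different point.
translate_then_rotate = apply(homogeneous_2d(0.0, 0.0, math.pi / 2),
                              apply(homogeneous_2d(2.0, 0.0, 0.0), (1.0, 0.0)))


def unit_disc(x, y):
    """The unit disc primitive in the robot's frame."""
    return x * x + y * y - 1.0


def in_transformed_disc(w):
    """Membership test for the transformed disc: evaluate at T^{-1} w."""
    return unit_disc(*apply(inverse_2d(T), w)) <= 0
```

With these definitions, `in_transformed_disc((2.5, 0.0))` holds because the disc is now centered at (2, 0), while `in_transformed_disc((0.0, 0.0))` does not.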



The rigid body transformations for the 3D case are conceptually similar to the 2D
case; however, the 3D case appears more difficult because 3D rotations are
significantly more complicated than 2D rotations.

One translates A by some $x_t, y_t, z_t \in \mathbb{R}$ by mapping every $(x, y, z) \in A$ to
$(x + x_t, y + y_t, z + z_t)$. Primitives of the form $H_i = \{ (x, y, z) \in W \mid f_i(x, y, z) \leq 0 \}$
are transformed to $\{ (x, y, z) \in W \mid f_i(x - x_t, y - y_t, z - z_t) \leq 0 \}$. The translated
robot is denoted as $A(x_t, y_t, z_t)$.

Note that a 3D body can be independently rotated about three orthogonal
axes, as shown in Figure 3.8. Borrowing aviation terminology, these rotations will
be referred to as yaw, pitch, and roll:

1. A yaw is a counterclockwise rotation of $\alpha$ about the Z-axis. The rotation
matrix is given by

$$ R_z(\alpha) = \begin{pmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{3.31} $$

Note that the upper left entries of $R_z(\alpha)$ form a 2D rotation applied to the
XY coordinates, while the Z coordinate remains constant.

2. A pitch is a counterclockwise rotation of $\beta$ about the Y-axis. The rotation
matrix is given by

$$ R_y(\beta) = \begin{pmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{pmatrix}. \tag{3.32} $$

3. A roll is a counterclockwise rotation of $\gamma$ about the X-axis. The rotation
matrix is given by

$$ R_x(\gamma) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\gamma & -\sin\gamma \\ 0 & \sin\gamma & \cos\gamma \end{pmatrix}. \tag{3.33} $$

Each rotation matrix is a simple extension of the 2D rotation matrix, (3.25). For
example, the yaw matrix, $R_z(\alpha)$, essentially performs a 2D rotation with respect to
the XY coordinates, while leaving the Z coordinate unchanged. Thus, the third
row and third column of $R_z(\alpha)$ look like part of the identity matrix, while the
upper left portion of $R_z(\alpha)$ looks like the 2D rotation matrix.

The yaw, pitch, and roll rotations can be used to place a 3D body in any
orientation. A single rotation matrix can be formed by multiplying the yaw,
pitch, and roll rotation matrices to obtain
$R(\alpha, \beta, \gamma) = R_z(\alpha)\, R_y(\beta)\, R_x(\gamma) =$

$$ \begin{pmatrix}
\cos\alpha\cos\beta & \cos\alpha\sin\beta\sin\gamma - \sin\alpha\cos\gamma & \cos\alpha\sin\beta\cos\gamma + \sin\alpha\sin\gamma \\
\sin\alpha\cos\beta & \sin\alpha\sin\beta\sin\gamma + \cos\alpha\cos\gamma & \sin\alpha\sin\beta\cos\gamma - \cos\alpha\sin\gamma \\
-\sin\beta & \cos\beta\sin\gamma & \cos\beta\cos\gamma
\end{pmatrix}. \tag{3.34} $$

It is important to note that $R(\alpha, \beta, \gamma)$ performs the roll first, then the pitch, and
finally the yaw. If the order of these operations is changed, a different rotation
matrix would result. Be careful when interpreting the rotations. Consider the
final rotation, yaw by $\alpha$. Imagine sitting inside of a robot A that looks like an
aircraft. If $\beta = \gamma = 0$, then the yaw turns the plane in a way that feels like turning
a car to the left. However, for arbitrary values of $\beta$ and $\gamma$, the final rotation axis
will not be vertically aligned with the aircraft because the aircraft is left in an
unusual orientation before $\alpha$ is applied. The yaw rotation occurs about the Z
axis of the world (or reference) frame, not the frame in which A is defined. Each
time a new rotation matrix is introduced from the left, it has no concern for the
orientations of any axes that were used for defining A. It simply rotates every
point in $\mathbb{R}^3$ in terms of the global reference frame.

Note that 3D rotations depend on three parameters, $\alpha$, $\beta$, and $\gamma$, whereas 2D
rotations depend only on a single parameter, $\theta$. The primitives of the model can
be transformed using $R(\alpha, \beta, \gamma)$, resulting in $A(\alpha, \beta, \gamma)$.
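The composition order can be explored with a short Python sketch. It assumes the standard counterclockwise rotation matrices about the Z, Y, and X axes for yaw, pitch, and roll, and composes them as yaw times pitch times roll; the function names are hypothetical.

```python
import math


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rot_z(alpha):
    """Yaw: counterclockwise rotation by alpha about the Z-axis."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_y(beta):
    """Pitch: counterclockwise rotation by beta about the Y-axis."""
    c, s = math.cos(beta), math.sin(beta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_x(gamma):
    """Roll: counterclockwise rotation by gamma about the X-axis."""
    c, s = math.cos(gamma), math.sin(gamma)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rotation(alpha, beta, gamma):
    """R = Rz(alpha) Ry(beta) Rx(gamma): roll first, then pitch, then yaw."""
    return matmul(rot_z(alpha), matmul(rot_y(beta), rot_x(gamma)))


def apply(R, p):
    """Apply the 3x3 matrix R to the point p = (x, y, z)."""
    return tuple(sum(R[r][k] * p[k] for k in range(3)) for r in range(3))


# With beta = gamma = 0, a quarter-turn yaw sends the X direction to Y.
nose = apply(rotation(math.pi / 2, 0.0, 0.0), (1.0, 0.0, 0.0))

# Reversing the multiplication order generally gives a different matrix.
a, b, g = 0.3, -0.5, 1.1
forward = rotation(a, b, g)
reversed_order = matmul(rot_x(g), matmul(rot_y(b), rot_z(a)))
```

Comparing `forward` with `reversed_order` entry by entry shows that the two products differ, which is the sense in which the rotations do not commute.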

It is often convenient to determine the $\alpha$, $\beta$, and $\gamma$ parameters directly from a
given rotation matrix. Suppose an arbitrary rotation matrix,

$$ \begin{pmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{pmatrix}, \tag{3.35} $$

is given. By setting each entry equal to its corresponding entry in (3.34), equations
are obtained that must be solved for $\alpha$, $\beta$, and $\gamma$. Note that $r_{21}/r_{11} = \tan\alpha$, and
$r_{32}/r_{33} = \tan\gamma$. Also, $r_{31} = -\sin\beta$, and $\sqrt{r_{32}^2 + r_{33}^2} = \cos\beta$. Solving for each
angle yields

$$ \alpha = \tan^{-1}(r_{21}/r_{11}), \tag{3.36} $$

$$ \beta = \tan^{-1}\!\left( -r_{31} \Big/ \sqrt{r_{32}^2 + r_{33}^2} \right), \tag{3.37} $$

and

$$ \gamma = \tan^{-1}(r_{32}/r_{33}). \tag{3.38} $$

There is a choice of four quadrants for the inverse tangent functions. How can
the correct quadrant be determined? Each quadrant should be chosen by using
the signs of the numerator and denominator of the argument. The numerator
sign selects whether the direction will be above or below the X axis, and
the denominator selects whether the direction will be to the left or right of the
Y axis. This is the same as the atan2 function in C, which takes the numerator and
denominator as separate arguments and thereby expands the range of the
arctangent to a full circle of angles (the C library function returns values in
$[-\pi, \pi]$). This can be applied to express (3.36), (3.37), and (3.38) as

$$ \alpha = \mathrm{atan2}(r_{21}, r_{11}), \tag{3.39} $$

$$ \beta = \mathrm{atan2}\!\left( -r_{31}, \sqrt{r_{32}^2 + r_{33}^2} \right), \tag{3.40} $$

and

$$ \gamma = \mathrm{atan2}(r_{32}, r_{33}). \tag{3.41} $$

Note that this method assumes $r_{11} \neq 0$ and $r_{33} \neq 0$.
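A round-trip check is a simple way to guard against swapped arguments when recovering the angles. The Python sketch below assumes the relations stated above, $\tan\alpha = r_{21}/r_{11}$, $\tan\gamma = r_{32}/r_{33}$, $r_{31} = -\sin\beta$, and $\cos\beta = \sqrt{r_{32}^2 + r_{33}^2}$, and uses `math.atan2(y, x)`, which takes the numerator (sine-like term) first. The function names are hypothetical.

```python
import math


def rotation(alpha, beta, gamma):
    """Closed form of the yaw-pitch-roll matrix, entry by entry as in (3.34)."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [[ca * cb, ca * sb * sg - sa * cg, ca * sb * cg + sa * sg],
            [sa * cb, sa * sb * sg + ca * cg, sa * sb * cg - ca * sg],
            [-sb, cb * sg, cb * cg]]


def yaw_pitch_roll(R):
    """Recover (alpha, beta, gamma) from R, assuming cos(beta) > 0."""
    alpha = math.atan2(R[1][0], R[0][0])                        # r21, r11
    beta = math.atan2(-R[2][0], math.hypot(R[2][1], R[2][2]))   # -r31, sqrt(r32^2 + r33^2)
    gamma = math.atan2(R[2][1], R[2][2])                        # r32, r33
    return alpha, beta, gamma


# Angles chosen so that alpha and gamma fall outside (-pi/2, pi/2),
# where a plain arctangent of the ratio would pick the wrong quadrant.
angles = (2.5, -0.7, -2.0)
recovered = yaw_pitch_roll(rotation(*angles))
print(all(abs(u - v) < 1e-12 for u, v in zip(angles, recovered)))  # True
```

The recovery is unique only for pitch angles with positive cosine; when $\cos\beta = 0$ the yaw and roll are no longer separately determined, so this sketch does not attempt to handle that case.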

As in the 2D case, a homogeneous transformation matrix can be defined. For
the 3D case, a 4 x 4 matrix is obtained that performs the rotation given by
$R(\alpha, \beta, \gamma)$, followed by a translation given by $x_t, y_t, z_t$. The result is $T =$

$$ \begin{pmatrix}
\cos\alpha\cos\beta & \cos\alpha\sin\beta\sin\gamma - \sin\alpha\cos\gamma & \cos\alpha\sin\beta\cos\gamma + \sin\alpha\sin\gamma & x_t \\
\sin\alpha\cos\beta & \sin\alpha\sin\beta\sin\gamma + \cos\alpha\cos\gamma & \sin\alpha\sin\beta\cos\gamma - \cos\alpha\sin\gamma & y_t \\
-\sin\beta & \cos\beta\sin\gamma & \cos\beta\cos\gamma & z_t \\
0 & 0 & 0 & 1
\end{pmatrix}. \tag{3.42} $$

Once again, the order of operations is critical. The matrix T in (3.42) represents
the following sequence of transformations:

1. Roll by $\gamma$.

2. Pitch by $\beta$.

3. Yaw by $\alpha$.

4. Translation by $(x_t, y_t, z_t)$.

The robot primitives can be transformed to yield $A(x_t, y_t, z_t, \alpha, \beta, \gamma)$. A 3D rigid
body that is capable of translation and rotation therefore has six degrees of
freedom.

3.3 Transformations of Kinematic Chains of Bodies

The transformations become more complicated for a chain of attached rigid bodies.
For convenience, each rigid body is referred to as a link. Let $A_1, A_2, \ldots, A_m$
denote a set of $m$ links. For each $i$ such that $1 \leq i < m$, link $A_i$ is "attached" to
link $A_{i+1}$ in a way that allows $A_{i+1}$ some constrained motion with respect to $A_i$.
The motion constraint must be explicitly given, and will be discussed shortly. As
an example, imagine a trailer that is attached to the back of a car by a hitch that
allows the trailer to rotate with respect to the car. In general, a set of attached
bodies will be referred to as a linkage. This section considers bodies that are
attached in a single chain. This leads to a particular linkage called a kinematic
chain.

3.3.1 A Kinematic Chain in $\mathbb{R}^2$

Before considering a chain, suppose $A_1$ and $A_2$ are two rigid bodies, each of
which is capable of translating and rotating in $W = \mathbb{R}^2$. Since each body has
three degrees of freedom, there is a combined total of six degrees of freedom, in
which the independent parameters are $x_1$, $y_1$, $\theta_1$, $x_2$, $y_2$, and $\theta_2$. When bodies are
attached in a kinematic chain, degrees of freedom are removed.

Figure 3.9 shows two different ways in which a pair of 2D links can be attached.
The place at which the links are attached is called a joint. In Figure 3.9.a, a
revolute joint is shown, in which one link is capable only of rotation with respect to
the other. In Figure 3.9.b, a prismatic joint is shown, in which one link translates
along the other. Each type of joint removes two degrees of freedom from the pair
of bodies. For example, consider a revolute joint that connects $A_1$ to $A_2$. Assume
that the point (0, 0) in the model for $A_2$ is permanently fixed to a point $(x_a, y_a)$
on $A_1$. This implies that the translation of $A_2$ will be completely determined once
$x_a$ and $y_a$ are given. Note that $x_a$ and $y_a$ are functions of $x_1$, $y_1$, and $\theta_1$. This
implies that $A_1$ and $A_2$ have a total of four degrees of freedom when attached.
The independent parameters are $x_1$, $y_1$, $\theta_1$, and $\theta_2$. The task in the remainder
of this section is to determine exactly how the models of $A_1, A_2, \ldots, A_m$ are
transformed, and give the expressions in terms of these independent parameters.

Figure 3.9: Two types of 2D joints: a) a revolute joint allows one link to rotate
with respect to the other, b) a prismatic joint allows one link to translate with
respect to the other.

Consider the case of a kinematic chain in which each pair of links is attached by
a revolute joint. The first task is to specify the geometric model for each link, $A_i$.
Recall that for a single rigid body, the origin of the coordinate frame determines
the axis of rotation. When defining the model for a link in a kinematic chain,
excessive complications can be avoided by carefully placing the coordinate frame.
Since rotation occurs about a revolute joint, a natural choice for the origin is the
joint between $A_i$ and $A_{i-1}$ for each $i > 1$. For convenience that will soon become
evident, the X-axis is defined as the line through both joints that lie in $A_i$, as
shown in Figure 3.9. For the last link, $A_m$, the X-axis can be placed arbitrarily,
assuming that the origin is placed at the joint that connects $A_m$ to $A_{m-1}$. The
coordinate frame for the first link, $A_1$, can be placed using the same considerations
as for a single rigid body.

We are now prepared to determine the location of each link. The position
and orientation of link $A_1$ in W is determined by applying the 2D homogeneous
transform matrix (3.30),

$$ T_1 = \begin{pmatrix} \cos\theta_1 & -\sin\theta_1 & x_t \\ \sin\theta_1 & \cos\theta_1 & y_t \\ 0 & 0 & 1 \end{pmatrix}. \tag{3.43} $$

As shown in Figure 3.10, let $a_{i-1}$ be the distance between the joints in $A_{i-1}$.
The orientation difference between $A_i$ and $A_{i-1}$ is denoted by the angle $\theta_i$. Let
$T_i$ represent a 3 x 3 homogeneous transform matrix (3.30), specialized for link $A_i$
for $1 < i \leq m$,

$$ T_i = \begin{pmatrix} \cos\theta_i & -\sin\theta_i & a_{i-1} \\ \sin\theta_i & \cos\theta_i & 0 \\ 0 & 0 & 1 \end{pmatrix}, \tag{3.44} $$

which generates the following sequence of transformations:

1. Rotate counterclockwise by $\theta_i$.

2. Translate by $a_{i-1}$ along the X-axis.

The transformation $T_i$ expresses the difference between the coordinate frame in
which $A_i$ was defined, and the frame in which $A_{i-1}$ was defined. The application
of $T_i$ moves $A_i$ from its initial frame to the frame in which $A_{i-1}$ is defined. The
application of $T_{i-1} T_i$ moves both $A_i$ and $A_{i-1}$ to the frame in which $A_{i-2}$ is
defined. By following this procedure, the location of any point $(x, y)$ on $A_m$ is
determined by multiplying the transformation matrices to obtain

$$ T_1 T_2 \cdots T_m, \tag{3.45} $$

and applying the product to the homogeneous vector $(x, y, 1)^T$.

Figure 3.10: The coordinate frame that is used to define the geometric model for
each $A_i$, for $1 < i < m$, is based on the joints that connect $A_i$ to $A_{i-1}$ and $A_{i+1}$.

Example 3.3.1 To gain an intuitive understanding of these transformations, consider
determining the configuration for link $A_3$, as shown in Figure 3.11. Figure
3.11.a shows a three-link chain in which $A_1$ is at its initial configuration, and the
other links are each offset by the same angle from the previous link. Figure 3.11.b
shows the frame in which the model for $A_3$ is initially defined. The application of
$T_3$ causes a rotation of $\theta_3$ and a translation by $a_2$. As shown in Figure 3.11.c, this
places $A_3$ in its appropriate configuration. Note that $A_2$ can be placed in its initial
configuration, and it will be attached correctly to $A_3$. The application of $T_2$ to the
previous result places both $A_3$ and $A_2$ in their proper configurations, and $A_1$ can
be placed in its initial configuration. ■

Figure 3.11: Applying the transformation $T_2 T_3$ to the model of $A_3$. In this case,
$T_1$ is the identity matrix. Panels: a) a three-link chain; b) $A_3$ in its initial frame;
c) $T_3$ puts $A_3$ in $A_2$'s initial frame; d) $T_2 T_3$ puts $A_3$ in $A_1$'s initial frame.
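The chain of 2D transformations can be sketched in Python as follows. The sketch assumes that each link matrix has rows $(\cos\theta_i, -\sin\theta_i, a_{i-1})$, $(\sin\theta_i, \cos\theta_i, 0)$, and $(0, 0, 1)$, and that the first link uses a general rotation and translation. The link lengths and joint angles are hypothetical values chosen so that the result is easy to verify by hand.

```python
import math


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def first_link(xt, yt, theta1):
    """T_1: place the first link by a rotation followed by a translation."""
    c, s = math.cos(theta1), math.sin(theta1)
    return [[c, -s, xt], [s, c, yt], [0.0, 0.0, 1.0]]


def link(theta_i, a_prev):
    """T_i for i > 1: rotate by theta_i, then translate by a_{i-1} along X."""
    c, s = math.cos(theta_i), math.sin(theta_i)
    return [[c, -s, a_prev], [s, c, 0.0], [0.0, 0.0, 1.0]]


def chain_transform(transforms):
    """Multiply T_1 T_2 ... T_m in that order."""
    T = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for Ti in transforms:
        T = matmul(T, Ti)
    return T


def apply(T, p):
    """Apply T to the point p = (x, y) on the last link."""
    v = (p[0], p[1], 1.0)
    return tuple(sum(T[r][k] * v[k] for k in range(3)) for r in range(2))


# Hypothetical three-link chain: T_1 is the identity, a_1 = 2, a_2 = 1,
# and each of links 2 and 3 is turned a quarter turn from the previous link.
T = chain_transform([first_link(0.0, 0.0, 0.0),
                     link(math.pi / 2, 2.0),
                     link(math.pi / 2, 1.0)])
joint3 = apply(T, (0.0, 0.0))   # origin of the third link: about (2, 1)
tip = apply(T, (1.0, 0.0))      # point one unit along the third link: about (1, 1)
```

The first link runs from (0, 0) to (2, 0), the second from (2, 0) up to (2, 1), and the third points back along the negative X direction, which agrees with the computed positions.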



For revolute joints, the parameters $a_i$ are treated as constants, and the $\theta_i$ are
variables. The transformed $m$th link is represented as $A_m(x_t, y_t, \theta_1, \ldots, \theta_m)$. In
some cases, the first link might have a fixed location in the world. In this case, the
revolute joints account for all degrees of freedom, yielding $A_m(\theta_1, \ldots, \theta_m)$. For
prismatic joints, the $a_i$ are treated as variables, as opposed to the $\theta_i$. Of course,
it is possible to include both types of joints in a single kinematic chain.

3.3.2 A Kinematic Chain in $\mathbb{R}^3$

As for a single rigid body, the 3D case is significantly more complicated than 2D
due to 3D rotations. Also, several more types of joints are possible, as shown
in Figure 3.12. Nevertheless, the main ideas from the transformations of 2D
kinematic chains extend to the 3D case. The following steps from Section 3.3.1
will be recycled here:

1. The coordinate frame must be carefully placed to define the model for each
$A_i$.

2. Based on joint relationships, several parameters are measured.

3. The parameters are used to define a homogeneous transformation matrix,
$T_i$.

4. The transformation of any point on link $A_m$ is given by applying the matrix
$T_1 T_2 \cdots T_m$.

Consider a kinematic chain of $m$ links in $W = \mathbb{R}^3$, in which each $A_i$ for
$1 \leq i < m$ is attached to $A_{i+1}$ by a revolute joint. Each link can be a complicated,
rigid body as shown in Figure 3.13. For the 2D problem, the coordinate frames
were based on the points of attachment. For the 3D problem, it is convenient to
use the axis of rotation of each revolute joint (this is equivalent to the point of
attachment for the 2D case). The axes of rotation will generally be skew lines in
$\mathbb{R}^3$, as shown in Figure 3.14. Let $Z_i$ refer to the axis of rotation for the revolute
joint that holds $A_i$ to $A_{i-1}$. Between each pair of axes in succession, let the $X_i$ axis
join the closest pair of points between $Z_i$ and $Z_{i+1}$, with the origin on $Z_i$ and the
direction pointing towards the nearest point of $Z_{i+1}$. This axis is uniquely defined
if $Z_i$ and $Z_{i+1}$ are not parallel. The recommended coordinate frame for defining
the geometric model for each $A_i$ will be given with respect to $Z_i$ and $X_i$, which
are given in Figure 3.14. Assuming a right-handed coordinate system, the $Y_i$
axis points away from us in Figure 3.14. In the transformations that will appear
shortly, the coordinate frame given by $X_i$, $Y_i$, and $Z_i$ will be most convenient for
defining the model for $A_i$. It might not always appear convenient because the
origin of the frame may even lie outside of $A_i$, but the resulting transformation
matrices will be easy to understand.

In Section 3.3.1, each $T_i$ was defined in terms of two parameters, $a_{i-1}$ and $\theta_i$.
For the 3D case, four parameters will be defined: $d_i$, $\theta_i$, $a_{i-1}$, and $\alpha_{i-1}$. These are
referred to as Denavit-Hartenberg parameters, or DH parameters for short [316].
The definition of each parameter is indicated in Figure 3.15. Figure 3.15.a shows
the definition of $d_i$. Note that $X_{i-1}$ and $X_i$ contact $Z_i$ at two different places.
Let $d_i$ denote the signed distance between these points of contact. If $X_i$ is above
$X_{i-1}$ along $Z_i$, then $d_i$ is positive; otherwise, $d_i$ is negative. The parameter $\theta_i$
is the angle between $X_i$ and $X_{i-1}$, which corresponds to the rotation about $Z_i$
that moves $X_{i-1}$ to coincide with $X_i$. In Figure 3.15.b, $Z_i$ is pointing outward. The
parameter $a_{i-1}$ is the distance between $Z_i$ and $Z_{i-1}$; recall these are generally skew
lines in $\mathbb{R}^3$. The parameter $\alpha_{i-1}$ is the angle between $Z_i$ and $Z_{i-1}$. In Figure
3.15.d, $X_{i-1}$ is pointing outward.

Figure 3.12: Types of 3D joints: revolute (1 degree of freedom), prismatic
(1 degree of freedom), screw (1 degree of freedom), cylindrical (2 degrees of
freedom), spherical (3 degrees of freedom), and planar (3 degrees of freedom).

Figure 3.14: The rotation axes of the generic links are skew lines in $\mathbb{R}^3$.



Two screws The homogeneous transformation matrix $T_i$ will be constructed by
combining two simpler transformations called screws. The transformation

$$ R_i = \begin{pmatrix} \cos\theta_i & -\sin\theta_i & 0 & 0 \\ \sin\theta_i & \cos\theta_i & 0 & 0 \\ 0 & 0 & 1 & d_i \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{3.46} $$

causes a rotation of $\theta_i$ about the $Z_i$ axis, and a translation of $d_i$ along the $Z_i$ axis.
Notice that the effect of $R_i$ is independent of the ordering of the rotation by $\theta_i$
and the translation by $d_i$ because both operations occur with respect to the same
axis, $Z_i$. The combined operation of a translation and rotation with respect to
the same axis is referred to as a screw (as in the motion of a screw through a
nut). The effect of $R_i$ can thus be considered as a screw about $Z_i$. The second
transformation is

$$ Q_{i-1} = \begin{pmatrix} 1 & 0 & 0 & a_{i-1} \\ 0 & \cos\alpha_{i-1} & -\sin\alpha_{i-1} & 0 \\ 0 & \sin\alpha_{i-1} & \cos\alpha_{i-1} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{3.47} $$

which can be considered as a screw about the $X_{i-1}$ axis. A rotation of $\alpha_{i-1}$ about
$X_{i-1}$ is followed by a translation of $a_{i-1}$.



Transformation matrix The homogeneous transformation matrix, $T_i$, for
$1 < i \leq m$, is

$$ T_i = Q_{i-1} R_i = \begin{pmatrix}
\cos\theta_i & -\sin\theta_i & 0 & a_{i-1} \\
\sin\theta_i\cos\alpha_{i-1} & \cos\theta_i\cos\alpha_{i-1} & -\sin\alpha_{i-1} & -\sin\alpha_{i-1}\, d_i \\
\sin\theta_i\sin\alpha_{i-1} & \cos\theta_i\sin\alpha_{i-1} & \cos\alpha_{i-1} & \cos\alpha_{i-1}\, d_i \\
0 & 0 & 0 & 1
\end{pmatrix}. \tag{3.48} $$

This can be considered as the 3D counterpart to the 2D transformation matrix,
(3.30). The following four operations are performed in succession:

1. Translate by $d_i$ along the Z-axis.

2. Rotate counterclockwise by $\theta_i$ about the Z-axis.

3. Translate by $a_{i-1}$ along the X-axis.

4. Rotate counterclockwise by $\alpha_{i-1}$ about the X-axis.

Matrix            $\alpha_{i-1}$   $a_{i-1}$   $\theta_i$    $d_i$
$T_1(\theta_1)$   0                0           $\theta_1$    0
$T_2(\theta_2)$   $-\pi/2$         0           $\theta_2$    $d_2$
$T_3(\theta_3)$   0                $a_2$       $\theta_3$    $d_3$
$T_4(\theta_4)$   $\pi/2$          $a_3$       $\theta_4$    $d_4$
$T_5(\theta_5)$   $-\pi/2$         0           $\theta_5$    0
$T_6(\theta_6)$   $\pi/2$          0           $\theta_6$    0

Table 3.1: The DH parameters are shown for substitution into each homogeneous
transformation matrix (3.48). Note that the parameters $a_3$ and $d_3$ must be written
as negative values (they are signed displacements, not distances).
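The factorization of the link matrix into two screws can be checked numerically. The Python sketch below assumes that the screw about Z rotates by $\theta_i$ and translates by $d_i$ along Z, that the screw about X rotates by $\alpha_{i-1}$ and translates by $a_{i-1}$ along X, and that the link matrix is the product of the X screw with the Z screw, in that order. The function names and sample parameter values are hypothetical.

```python
import math


def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def screw_z(theta, d):
    """Screw about Z: rotate by theta about Z and translate by d along Z."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, d],
            [0.0, 0.0, 0.0, 1.0]]


def screw_x(alpha, a):
    """Screw about X: rotate by alpha about X and translate by a along X."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[1.0, 0.0, 0.0, a],
            [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def dh_matrix(alpha_prev, a_prev, theta, d):
    """Closed-form link matrix written out entry by entry."""
    ca, sa = math.cos(alpha_prev), math.sin(alpha_prev)
    ct, st = math.cos(theta), math.sin(theta)
    return [[ct, -st, 0.0, a_prev],
            [st * ca, ct * ca, -sa, -sa * d],
            [st * sa, ct * sa, ca, ca * d],
            [0.0, 0.0, 0.0, 1.0]]


# Hypothetical sample values for (alpha_{i-1}, a_{i-1}, theta_i, d_i).
params = (0.7, 1.5, -0.4, 0.25)
alpha_prev, a_prev, theta, d = params
product = matmul(screw_x(alpha_prev, a_prev), screw_z(theta, d))
closed = dh_matrix(*params)
max_error = max(abs(product[i][j] - closed[i][j])
                for i in range(4) for j in range(4))
```

For these values `max_error` is at the level of floating-point rounding, which supports reading the closed form as the product of the two screws.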

As in the 2D case, the first matrix, $T_1$, is special. To represent any position
and orientation of $A_1$, it could be defined as a general rigid-body homogeneous
transformation matrix (3.42). If the first body is only capable of rotation via a
revolute joint, then a simple convention is usually followed. Let the $a_0$, $\alpha_0$
parameters of $T_1$ be assigned as $a_0 = \alpha_0 = 0$ (there is no $Z_0$ axis). This implies that $Q_0$
from (3.47) is the identity matrix, which makes $T_1 = R_1$.

The transformation $T_i$ gives the relationship of the frame for $A_i$ to the frame
for $A_{i-1}$. The position of a point $(x, y, z)$ on $A_m$ is given by

$$ T_1 T_2 \cdots T_m \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}. \tag{3.49} $$

For each revolute joint, $\theta_i$ is treated as the only variable in $T_i$. Prismatic joints
can be modeled by allowing $a_i$ to vary. More complicated joints can be modeled as
a sequence of degenerate joints. For example, a spherical joint can be considered
as a sequence of three zero-length revolute joints; the joints perform a roll, a
pitch, and a yaw. Another option for more complicated joints is to abandon the
DH representation and directly develop the homogeneous transformation matrix.
This might be needed to preserve topological properties that become important
in Chapter 4.

Example 3.3.2 (PUMA 560) This example demonstrates the 3D chain kinematics
on a classic robot manipulator, the PUMA 560, shown in Figure 3.16. The
current parameterization here is based on [?, 413]. The procedure is to determine
appropriate coordinate frames to represent each one of the links. The first three
links allow the hand (called an end-effector) to make large movements in W,
and the last three enable the hand to achieve a desired orientation. There are
six degrees of freedom, each of which arises from a revolute joint. The coordinate
frames are shown in Figure 3.16, and the corresponding DH parameters are given
in Table 3.1. Each transformation matrix, $T_i$, may be considered as a function of
$\theta_i$; hence, it is written $T_i(\theta_i)$. The other parameters are fixed for this example.
Only $\theta_1, \theta_2, \ldots, \theta_6$ are allowed to vary.

Figure 3.16: The Puma 560 is shown along with the DH parameters and
coordinate frames for each link in the chain. This figure is borrowed from [413]
by courtesy of the authors.

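The parameters from Table 3.1 may be substituted into the homogeneous
transformation matrices to obtain

$$ T_1 = \begin{pmatrix} \cos\theta_1 & -\sin\theta_1 & 0 & 0 \\ \sin\theta_1 & \cos\theta_1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{3.50} $$

$$ T_2 = \begin{pmatrix} \cos\theta_2 & -\sin\theta_2 & 0 & 0 \\ 0 & 0 & 1 & d_2 \\ -\sin\theta_2 & -\cos\theta_2 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{3.51} $$

$$ T_3 = \begin{pmatrix} \cos\theta_3 & -\sin\theta_3 & 0 & a_2 \\ \sin\theta_3 & \cos\theta_3 & 0 & 0 \\ 0 & 0 & 1 & d_3 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{3.52} $$

$$ T_4 = \begin{pmatrix} \cos\theta_4 & -\sin\theta_4 & 0 & a_3 \\ 0 & 0 & -1 & -d_4 \\ \sin\theta_4 & \cos\theta_4 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{3.53} $$

$$ T_5 = \begin{pmatrix} \cos\theta_5 & -\sin\theta_5 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\sin\theta_5 & -\cos\theta_5 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{3.54} $$

and

$$ T_6 = \begin{pmatrix} \cos\theta_6 & -\sin\theta_6 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ \sin\theta_6 & \cos\theta_6 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{3.55} $$

A point, $(x, y, z)$, in the frame of the last link, $A_6$, appears in W as

$$ T_1(\theta_1)\, T_2(\theta_2)\, T_3(\theta_3)\, T_4(\theta_4)\, T_5(\theta_5)\, T_6(\theta_6) \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}. \tag{3.56} $$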
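The product of the six link matrices can be evaluated numerically with a short Python sketch. It encodes one reading of Table 3.1 in the list `PUMA_ROWS` (an assumption of this sketch, with unlisted offsets taken as zero) and uses the general DH link matrix. The numerical link dimensions are hypothetical placeholders, not measurements of a real PUMA 560; only their signs follow the note that $a_3$ and $d_3$ are negative.

```python
import math


def dh_matrix(alpha_prev, a_prev, theta, d):
    """General DH link matrix for parameters (alpha_{i-1}, a_{i-1}, theta_i, d_i)."""
    ca, sa = math.cos(alpha_prev), math.sin(alpha_prev)
    ct, st = math.cos(theta), math.sin(theta)
    return [[ct, -st, 0.0, a_prev],
            [st * ca, ct * ca, -sa, -sa * d],
            [st * sa, ct * sa, ca, ca * d],
            [0.0, 0.0, 0.0, 1.0]]


def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


# Hypothetical link dimensions (NOT real PUMA 560 values).
A2, A3, D2, D3, D4 = 0.4, -0.02, 0.25, -0.1, 0.4

# Assumed reading of Table 3.1: (alpha_{i-1}, a_{i-1}, d_i) for links 1..6.
PUMA_ROWS = [(0.0, 0.0, 0.0),
             (-math.pi / 2, 0.0, D2),
             (0.0, A2, D3),
             (math.pi / 2, A3, D4),
             (-math.pi / 2, 0.0, 0.0),
             (math.pi / 2, 0.0, 0.0)]


def forward(thetas, point=(0.0, 0.0, 0.0)):
    """Position in W of a point given in the frame of the last link."""
    T = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for (alpha_prev, a_prev, d), theta in zip(PUMA_ROWS, thetas):
        T = matmul(T, dh_matrix(alpha_prev, a_prev, theta, d))
    v = (point[0], point[1], point[2], 1.0)
    return tuple(sum(T[r][k] * v[k] for k in range(4)) for r in range(3))


home = forward([0.0] * 6)                      # all joint angles zero
turned = forward([math.pi / 2] + [0.0] * 5)    # quarter turn of the first joint
```

With all joint angles zero, the origin of the last frame lands at approximately $(a_2 + a_3,\; d_2 + d_3,\; d_4)$ under these assumptions, and turning only the first joint rotates that position about the Z axis of the world frame.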



Example 3.3.3 (Transforming Octane) Figure 3.17 shows a ball-and-stick model
of an octane molecule. Each "ball" is an atom, and each "stick" represents a bond
between a pair of atoms. There is a linear chain of eight carbon atoms, and a
bond exists between each consecutive pair of carbons in the chain. There are also
numerous hydrogen atoms, but we will ignore them. Each bond between a pair of
carbons is capable of twisting, as shown in Figure 3.18. Studying the configurations
(called conformations) of molecules is an important part of computational
biology. It is assumed that there are seven degrees of freedom, each of which
arises from twisting a bond. The techniques from this section can be applied to
represent these transformations.

Figure 3.17: A hydrocarbon (octane) molecule with 8 carbon atoms and 18 hydrogen
atoms (courtesy of the New York University Molecular Library).




Figure 3.18: Consider transforming the spine of octane by ignoring hydrogen 
atoms and allowing the bonds between carbons to rotate. You could also construct 
this easily with Tinkertoys. If the first link is held fixed, then there are six degrees 
of freedom. The rotation of the last link is ignored. 
Figure 3.19: Each bond may be interpreted as a "link" of length $d_i$ that is aligned
with the $Z_i$ axis. Note that most of $A_i$ appears in the negative $Z_i$ direction.
Note that the bonds correspond exactly to the axes of rotation. This suggests
that the $Z_i$ axes should be chosen to coincide with the bonds. Since consecutive bonds
meet at an atom, there is no distance between them. From Figure 3.15.c, observe
that this will make $a_i = 0$ for all $i$. From Figure 3.15.a, it can be seen that each $d_i$
will correspond to a bond length, the distance between consecutive carbon atoms.
See Figure ??. This leaves two angular parameters, $\theta_i$ and $\alpha_i$. Since the only
possible motion of the links is via rotation of the $Z_i$ axes, the angle between two
consecutive axes, as shown in Figure 3.15.d, must remain constant. In chemistry,
this is referred to as the bond angle, and is represented in the DH parameterization
as $\alpha_i$. The remaining $\theta_i$ parameters are the variables that represent the degrees of
freedom. However, looking at Figure 3.15.b, observe that the example is degenerate:
each $X_i$ has no frame of reference because each $a_i = 0$. This does
not, however, cause any problems. For visualization purposes, it may be helpful
to replace $X_{i-1}$ and $X_i$ by $Z_{i-1}$ and $Z_{i+1}$, respectively. This way it is easy to see
that as the bond for $Z_i$ is twisted, the observed angle changes accordingly. Each
bond is interpreted as a link, $A_i$.

The origin of each $A_i$ must be chosen to coincide with the intersection point
of $Z_i$ and $Z_{i+1}$. Thus, most of the points in $A_i$ will lie in the $-Z_i$ direction; see
Figure ??.

The next task is to write down the matrices. Attach a coordinate frame to the
first bond, with the second atom at the origin, and the bond aligned with the $Z$
axis, in the negative direction; see Figure ??. To define $T_1$, recall that $T_1 = R_1$
from (3.46) because $Q_0$ is dropped. The parameter $d_1$ represents the distance
between the intersection points of Axis 0 and Axis 2 along Axis 1. Since there is
no Axis 0, there is freedom to choose $d_1$; hence, let $d_1 = 0$ to obtain

\[
T_1(\theta_1) = R_1(\theta_1) =
\begin{pmatrix}
\cos\theta_1 & -\sin\theta_1 & 0 & 0 \\
\sin\theta_1 & \cos\theta_1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}.
\tag{3.57}
\]



The application of $T_1$ to points in $A_1$ causes them to rotate around the $Z_1$ axis,
which appears correct.

The matrices for the remaining six bonds are

\[
T_i(\theta_i) =
\begin{pmatrix}
\cos\theta_i & -\sin\theta_i & 0 & 0 \\
\sin\theta_i \cos\alpha_{i-1} & \cos\theta_i \cos\alpha_{i-1} & -\sin\alpha_{i-1} & -\sin\alpha_{i-1}\, d_i \\
\sin\theta_i \sin\alpha_{i-1} & \cos\theta_i \sin\alpha_{i-1} & \cos\alpha_{i-1} & \cos\alpha_{i-1}\, d_i \\
0 & 0 & 0 & 1
\end{pmatrix}
\tag{3.58}
\]

for $i \in \{2, \ldots, 7\}$. The notation $T_i(\theta_i)$ indicates that $\theta_i$ is the only variable. All
other parameters of $T_i$ are constants. The position of any point, $(x, y, z)$, on the
last link, $A_7$, is given by

\[
T_1(\theta_1)\, T_2(\theta_2)\, T_3(\theta_3)\, T_4(\theta_4)\, T_5(\theta_5)\, T_6(\theta_6)\, T_7(\theta_7)
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}.
\tag{3.59}
\]

Figure 3.20: General linkages: a) Instead of a chain of rigid bodies, a "tree" of
rigid bodies can be considered; b) if there are loops, then parameters must be
carefully assigned to ensure that the loops are closed.
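The octane chain of Example 3.3.3 can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than a reference implementation: `dh_matrix` encodes (3.58) with every $a_{i-1} = 0$, the same function with $\alpha = 0$ and $d = 0$ reproduces (3.57), and the bond length and angle used in the usage lines are hypothetical round numbers, not values taken from the text.

```python
import math

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dh_matrix(theta, alpha_prev, d):
    """T_i(theta_i) from (3.58), in which every a_{i-1} is zero."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha_prev), math.sin(alpha_prev)
    return [[ct, -st, 0.0, 0.0],
            [st * ca, ct * ca, -sa, -sa * d],
            [st * sa, ct * sa, ca, ca * d],
            [0.0, 0.0, 0.0, 1.0]]

def frame_origins(thetas, alpha, d):
    """World positions of the frame origins of A_1, ..., A_k for k = len(thetas)."""
    T = dh_matrix(thetas[0], 0.0, 0.0)  # T_1 = R_1(theta_1), with d_1 = 0
    origins = [[T[r][3] for r in range(3)]]
    for theta in thetas[1:]:
        T = matmul(T, dh_matrix(theta, alpha, d))  # append T_i to the product
        origins.append([T[r][3] for r in range(3)])
    return origins

if __name__ == "__main__":
    # Hypothetical constants: bond length 1.5 and alpha of 70 degrees.
    thetas = [0.3, -1.1, 2.0, 0.7, -0.4, 1.9, 0.2]
    for origin in frame_origins(thetas, math.radians(70.0), 1.5):
        print([round(v, 3) for v in origin])
```

However the seven twist angles are chosen, consecutive frame origins stay exactly one bond length apart, which reflects the fact that only the $\theta_i$ vary while each $d_i$ and $\alpha_i$ is constant.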



3.4 Transformations of Kinematic Trees 

Motivation For many interesting problems, the linkage is arranged in a "tree"
as shown in Figure 3.20.a. Assume here that the links are not attached in ways that
form loops (i.e., Figure 3.20.b); that case is deferred until Section 4.4, although
some comments are also made at the end of this section. The human body, with its 
joints and limbs attached to the torso, is an example that can be modeled as a tree 
of rigid links. Joints such as knees and elbows are considered as revolute joints. 
A shoulder joint is an example of a spherical joint, although it cannot achieve 
any orientation (without a visit to the emergency room!). As indicated by Figure 
??, there is widespread interest in animating humans in virtual environments and 
also in developing humanoid robots. Both of these cases rely on formulations of 
kinematics that mimic the human body. 

Another problem that involves kinematic trees is the conformational analysis
of molecules. Example ?? involved a single chain; however, most organic molecules
are more complicated, as in the familiar drugs shown in Figure 3.22. The bonds
may twist to give degrees of freedom to the molecule. Moving through the space
of conformations requires the formulation of a kinematic tree. Studying these
conformations is important because scientists need to determine for some candidate
drug whether or not the molecule can twist the right way so that it docks nicely
(low energy) with a protein cavity; this induces a pharmacological effect, which
hopefully is the desired one. Another important problem is determining how
complicated protein molecules fold into certain configurations. These molecules are
orders of magnitude larger (in terms of numbers of atoms and degrees of freedom)
than typical drug molecules.

Figure 3.21: a) This is a picture of the H7 humanoid robot and one of its
developers, S. Kagami. It was developed in the JSK Laboratory at the University
of Tokyo. b) This is a digital actor whose motions were generated by planning
algorithms. This was part of the Ph.D. thesis of James Kuffner at Stanford
University.

Common joints for $W = \mathbb{R}^2$ First consider the simplest case in which there is
a 2D tree of links for which every link has only two points at which revolute joints
may be attached. This corresponds to Figure 3.20.a. A single link is designated
as the root, $A_1$, of the tree. To determine the transformation of a body, $A_i$, in
the tree, the tools from Section 3.3.1 are directly applied to the chain of bodies that
connects $A_i$ to $A_1$, while ignoring all other bodies. When determining the degrees
of freedom of the entire tree, there will be one $\theta_i$ for each link of the tree. This
case seems quite straightforward; unfortunately, it is not this easy in general.
Figure 3.22: Several familiar drugs (caffeine, ibuprofen, and THC) are pictured
as ball-and-stick models (courtesy of the New York University Molecular Library).
Analyzing the flexibility of these molecules is an important part of drug design.
Note that they can be treated as robots made of many links. Kinematic tree and
closed chain issues become important.

Junctions with more than two rotation axes Now consider modeling a
more complicated collection of attached links. The main novelty is that one
link may have joints attached to it in more than two locations, as in $A_7$ from
Figure 3.23. A link with more than two joints will be referred to as a junction.

If there is only one junction, then most of the complications arising from
junctions can be avoided by choosing the junction as the root. For example, for
a simple humanoid model, the torso would be a junction. It would be sensible
to make this the root of the tree, as opposed to the right foot, for instance. The
legs, arms, and head could all be modeled as independent chains. In each chain,
the only concern is that the first link will not necessarily be defined around the
coordinate origin. This could be accounted for by inserting a fixed, fictitious link
that connects from the origin of the torso to the attachment point of the limb.

The situation is more interesting if there are multiple junctions. Suppose that
Figure 3.23 represents part of a 2D system of links for which the root, $A_1$, is
attached via a chain of bodies to $A_5$. To transform link $A_9$, the tools from
Section 3.3.1 may be directly applied to yield a sequence of transformations,




\[
T_1 \cdots T_5\, T_6\, T_7\, T_8\, T_9
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix},
\tag{3.60}
\]



for a point $(x, y) \in A_9$. Likewise, to transform $A_{13}$, the sequence




\[
T_1 \cdots T_5\, T_6\, T_7\, T_{12}\, T_{13}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
\tag{3.61}
\]



can be used by ignoring the chain of links $A_8$ and $A_9$. So far everything seems to
work well, but take a close look at $A_7$. As shown in Figure 3.24, its coordinate
frame was defined in two different ways, one for each chain. If both are forced to
use the same frame, then at least one must abandon the nice conventions of Section
3.3.1 for choosing frames. This situation becomes worse for 3D trees because this
would suggest abandoning the DH parameterization.

Figure 3.23: Now it is possible for a link to have more than two joints, as in $A_7$.

Constraining parameters Fortunately, it is fine to use different frames when
following different chains; however, one extra piece of information is needed.
Imagine transforming the whole tree. The variable $\theta_7$ will appear twice, once from each
of the upper and lower chains. Let $\theta_{7u}$ and $\theta_{7l}$ denote these $\theta$'s. Can $\theta_7$ really be
chosen two different ways? This would imply that the tree is instead as pictured
in Figure 3.25, in which there are two independently moving links, $A_{7u}$ and $A_{7l}$.
To fix this problem, a constraint must be imposed. Suppose that $\theta_{7l}$ is treated as
an independent variable. The parameter $\theta_{7u}$ must then be chosen as $\theta_{7l} + \phi$, in
which $\phi$ is as shown in Figure 3.24.

For a 3D tree of bodies the same general principles may be followed. In some 
cases, there will not be any complications that involve special considerations of 
junctions and constraints. One example of this is the transformation of flexible 
molecules because all consecutive rotation axes intersect, and junctions occur 
directly at these points of intersection. In general, however, the DH parameter 
technique may be applied for each chain, and then the appropriate constraints 
have to be determined and applied to represent the true degrees of freedom of the 
tree. 

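Figure 3.24: The junction is assigned two different frames, depending on which
chain was followed. The solid axes were obtained from transforming $A_9$, and the
dashed axes were obtained from transforming $A_{13}$.

Figure 3.25: Choosing each $\theta_7$ independently would result in a tree that ignores
the fact that $A_7$ is rigid.

Example 3.4.1 Figure 3.26 shows a 2D example that involves six links. To transform
$(x, y) \in A_6$, the only relevant links are $A_5$, $A_2$, and $A_1$. The chain of transformations
is

\[
T_1\, T_{2l}\, T_5\, T_6
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix},
\tag{3.62}
\]

in which

\[
T_1 =
\begin{pmatrix}
\cos\theta_1 & -\sin\theta_1 & x_t \\
\sin\theta_1 & \cos\theta_1 & y_t \\
0 & 0 & 1
\end{pmatrix},
\tag{3.63}
\]

\[
T_{2l} =
\begin{pmatrix}
\cos\theta_{2l} & -\sin\theta_{2l} & a_1 \\
\sin\theta_{2l} & \cos\theta_{2l} & 0 \\
0 & 0 & 1
\end{pmatrix},
\tag{3.64}
\]

\[
T_5 =
\begin{pmatrix}
\cos\theta_5 & -\sin\theta_5 & a_2 \\
\sin\theta_5 & \cos\theta_5 & 0 \\
0 & 0 & 1
\end{pmatrix},
\tag{3.65}
\]

and

\[
T_6 =
\begin{pmatrix}
\cos\theta_6 & -\sin\theta_6 & a_5 \\
\sin\theta_6 & \cos\theta_6 & 0 \\
0 & 0 & 1
\end{pmatrix},
\tag{3.66}
\]

in which $T_{2l}$ denotes the fact that the lower chain was followed. The transformation
for points in $A_4$ is

\[
T_1\, T_{2u}\, T_3\, T_4
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix},
\tag{3.67}
\]

in which $T_1$ is the same as before, and

\[
T_3 =
\begin{pmatrix}
\cos\theta_3 & -\sin\theta_3 & a_2 \\
\sin\theta_3 & \cos\theta_3 & 0 \\
0 & 0 & 1
\end{pmatrix}
\tag{3.68}
\]

and

\[
T_4 =
\begin{pmatrix}
\cos\theta_4 & -\sin\theta_4 & a_4 \\
\sin\theta_4 & \cos\theta_4 & 0 \\
0 & 0 & 1
\end{pmatrix}.
\tag{3.69}
\]

The interesting case is

\[
T_{2u} =
\begin{pmatrix}
\cos\theta_{2u} & -\sin\theta_{2u} & a_1 \\
\sin\theta_{2u} & \cos\theta_{2u} & 0 \\
0 & 0 & 1
\end{pmatrix}
=
\begin{pmatrix}
\cos(\theta_{2l} + \pi/4) & -\sin(\theta_{2l} + \pi/4) & a_1 \\
\sin(\theta_{2l} + \pi/4) & \cos(\theta_{2l} + \pi/4) & 0 \\
0 & 0 & 1
\end{pmatrix},
\tag{3.70}
\]

in which the constraint $\theta_{2u} = \theta_{2l} + \pi/4$ is imposed to enforce the fact that $A_2$ is
a junction. ■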
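The role of the junction constraint can be illustrated with a small numerical check. The sketch below is only an illustration under stated assumptions: the root is placed at the origin, the length `a1` is a hypothetical value, and the helper names are invented here. It shows that once $\theta_{2u}$ is tied to $\theta_{2l}$ by the fixed offset $\pi/4$, a physical point of the junction link lands at the same world position whichever of the two frames is used to describe it.

```python
import math

PHI = math.pi / 4  # fixed offset between the two frames of the junction, as in (3.70)

def transform2d(theta, a):
    """2D homogeneous transform: rotate by theta, then translate by (a, 0)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, a], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_chain(matrices, point):
    """Apply the product T_1 T_2 ... T_k to (x, y), rightmost matrix first."""
    x, y = point
    for T in reversed(matrices):
        x, y = (T[0][0] * x + T[0][1] * y + T[0][2],
                T[1][0] * x + T[1][1] * y + T[1][2])
    return (x, y)

def junction_transforms(theta1, theta2l, a1):
    """Return (T_1, T_2l, T_2u) with theta_2u constrained to theta_2l + PHI."""
    T1 = transform2d(theta1, 0.0)          # assumption: root attached at the origin
    T2l = transform2d(theta2l, a1)         # frame of A_2 used by the lower chain
    T2u = transform2d(theta2l + PHI, a1)   # frame of A_2 used by the upper chain
    return T1, T2l, T2u

def lower_to_upper(point):
    """Re-express a point of A_2 from its lower frame in its upper frame."""
    x, y = point
    c, s = math.cos(PHI), math.sin(PHI)
    return (c * x + s * y, -s * x + c * y)  # rotation by -PHI

if __name__ == "__main__":
    T1, T2l, T2u = junction_transforms(theta1=0.4, theta2l=-0.9, a1=1.0)
    p = (0.3, 0.2)  # a point of A_2, written in the lower frame
    print(apply_chain([T1, T2l], p))
    print(apply_chain([T1, T2u], lower_to_upper(p)))  # same world position
```

If $\theta_{2u}$ were instead chosen independently, the two printed positions would generally differ, which is the situation pictured in Figure 3.25.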



What if there are loops? The most general case includes links that are
connected in loops, as shown in Figure 3.27. These are generally referred to as closed
kinematic chains. This arises in many applications. For example, with humanoid
robotics or digital actors, a loop is formed when both feet touch the ground. As
another example, suppose that two robot manipulators, like the Puma 560 from
Example 3.3.2, cooperate to carry an object. If each robot grasps the
same object with its hand, then a loop will be formed. Furthermore, a large
fraction of organic molecules have flexible loops. Exploring the space of their
conformations requires careful consideration of the difficulties imposed by these
loops.

Figure 3.26: A tree of bodies in which the joints are attached in different places.

Figure 3.27: There are ten links and ten revolute joints arranged in a loop. This
is an example of a closed kinematic chain.

The main difficulty of working with closed kinematic chains is that it is hard
to determine which parameter values are within an acceptable range to ensure
closure. If these values are given, then the transformations are handled in the
same way as the case of trees. For example, the links in Figure 3.27 may be
transformed by breaking the loop into two different chains. Suppose we forget
that the joint between $A_5$ and $A_6$ exists. Consider two different kinematic chains
that start at the joint on the extreme left. There is an upper chain from $A_1$ to $A_5$,
and a lower chain from $A_{10}$ to $A_6$. The transformations for any of these bodies
can be obtained directly from the techniques of Section 3.3.1. Thus, it is easy to
transform the bodies, but how do we choose parameter values that ensure $A_5$ and
$A_6$ are connected at their common joint? Using the upper chain, the position of
this joint may be expressed as

\[
T_1(\theta_1)\, T_2(\theta_2)\, T_3(\theta_3)\, T_4(\theta_4)\, T_5(\theta_5)
\begin{pmatrix} a_5 \\ 0 \\ 1 \end{pmatrix},
\tag{3.71}
\]

in which $(a_5, 0) \in A_5$ is the location of the joint of $A_5$ that is supposed to connect to
$A_6$. The position of this joint may also be expressed using the lower chain as

\[
T_{10}(\theta_{10})\, T_9(\theta_9)\, T_8(\theta_8)\, T_7(\theta_7)\, T_6(\theta_6)
\begin{pmatrix} a_6 \\ 0 \\ 1 \end{pmatrix},
\tag{3.72}
\]

with $(a_6, 0)$ representing the position of the joint in the frame of $A_6$. If the loop
does not have to be maintained, then any values for $\theta_1, \ldots, \theta_{10}$ may be selected,
resulting in ten degrees of freedom. However, if a loop must be maintained, then
(3.71) and (3.72) must be equal,

\[
T_1(\theta_1)\, T_2(\theta_2)\, T_3(\theta_3)\, T_4(\theta_4)\, T_5(\theta_5)
\begin{pmatrix} a_5 \\ 0 \\ 1 \end{pmatrix}
=
T_{10}(\theta_{10})\, T_9(\theta_9)\, T_8(\theta_8)\, T_7(\theta_7)\, T_6(\theta_6)
\begin{pmatrix} a_6 \\ 0 \\ 1 \end{pmatrix},
\tag{3.73}
\]

which is quite a mess of nonlinear, trigonometric equations that must be solved.
The set of solutions to (3.73) could be very complicated. For the example, the
total degrees of freedom is eight because two were removed by making the joint
common. Since the common joint allows the links to rotate, only two degrees of
freedom are lost. If $A_5$ and $A_6$ had to be rigidly attached, then the total degrees
of freedom would be only seven. For most problems that involve loops, it will not
be possible to obtain a nice parameterization of the set of solutions. The problem
is a form of the well-known inverse kinematics problem [].
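Although (3.73) is hard to solve, checking whether a given set of angles satisfies it is easy. The following Python sketch, offered as a minimal illustration under stated assumptions, evaluates both sides of (3.73) for a planar loop split into two chains and reports the gap between them. The function names, the unit link lengths, and the sample angles are hypothetical; each chain is assumed to start at the origin, with angles and lengths listed from the base link outward.

```python
import math

def chain_endpoint(thetas, lengths):
    """Position of the distal joint of the last link of a planar chain.

    thetas[i] is the joint angle of link i relative to the previous link, and
    lengths[i] is the distance between the two joints of link i.
    """
    # Start with the joint location (a_k, 0) in the frame of the last link.
    x, y = lengths[-1], 0.0
    # Apply T_k, ..., T_1 in that order (rightmost matrix first).
    for i in range(len(thetas) - 1, -1, -1):
        a_prev = lengths[i - 1] if i > 0 else 0.0  # the base link is not translated
        c, s = math.cos(thetas[i]), math.sin(thetas[i])
        x, y = c * x - s * y + a_prev, s * x + c * y
    return (x, y)

def closure_error(upper_thetas, upper_lengths, lower_thetas, lower_lengths):
    """Distance between the two sides of (3.73); zero means the loop is closed."""
    ux, uy = chain_endpoint(upper_thetas, upper_lengths)
    lx, ly = chain_endpoint(lower_thetas, lower_lengths)
    return math.hypot(ux - lx, uy - ly)

if __name__ == "__main__":
    unit = [1.0] * 5
    upper = [math.pi / 3, 0.0, -math.pi / 3, -math.pi / 3, 0.0]   # theta_1 .. theta_5
    lower = [-math.pi / 3, 0.0, math.pi / 3, math.pi / 3, 0.0]    # theta_10 .. theta_6
    print(closure_error(upper, unit, lower, unit))   # closed loop: essentially zero
    upper[4] += 0.1                                   # move one joint on its own
    print(closure_error(upper, unit, lower, unit))   # the loop is now broken
```

The second call shows the difficulty described above: changing a single angle independently almost always violates the closure constraint, so the remaining angles would have to be adjusted to compensate.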

In general, a complicated arrangement of links can be imagined in which there
are many loops. Each time a joint along a loop is "ignored", as in the procedure
just described, one less loop exists. This process can be repeated iteratively
until there are no more loops in the graph. The resulting arrangement of links
will be a tree for which the previous techniques of this section may be applied.
However, for each joint that was "ignored" an equation similar to (3.73) must be
introduced. All of these equations must be satisfied simultaneously to respect the
original loop constraints. Suppose that a set of parameter values is already given.
This could happen, for example, using motion capture technology to measure
the position and orientation of every part of a human body in contact with the
ground. From this the solution parameters could be computed, and all of the
transformations are easy to represent. However, as soon as the model moves, it
is difficult to ensure that the new transformations respect the closure constraints.
The foot of the digital actor may push through the floor, for example. Further
information on characterizing this complicated solution space is given in Section
4.4.

Figure 3.28: Loops may be opened to enable tree-based transformations to be
applied; however, a closure constraint must still be satisfied.

3.5 Nonrigid Transformations

One can easily imagine motion planning for nonrigid bodies. This falls outside
of the families of transformations studied so far in this chapter. Several kinds of
nonrigid transformations are briefly surveyed here.

Linear transformations Rotations are a special case of linear transformations,
which are generally expressed by an $n \times n$ matrix, $M$, if the transformations are
performed over $\mathbb{R}^n$. Consider transforming points, $(x, y)$, in a 2D robot, $A$, as

\[
\begin{pmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}.
\tag{3.74}
\]

If $M$ is a rotation matrix, then the "shape" of $A$ will remain the same. In some
applications, however, it may be desirable to distort the shape.
The robot can be scaled by $m_{11}$ along the $X$ axis and $m_{22}$ along the $Y$ axis by
applying

\[
\begin{pmatrix} m_{11} & 0 \\ 0 & m_{22} \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
\tag{3.75}
\]

for positive real values $m_{11}$ and $m_{22}$. If one of them is negated, then a mirror
image of $A$ is obtained.

In addition to scaling, $A$ can be sheared by applying

\[
\begin{pmatrix} 1 & m_{12} \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
\tag{3.76}
\]

for $m_{12} \neq 0$. The case of $m_{12} = 1$ is shown in Figure 3.29.

Figure 3.29: Shearing transformations may be performed.

The scaling, shearing, and rotation matrices may be multiplied together to
yield a general transformation matrix that explicitly parameterizes each effect.
It is also possible to extend the $M$ from $n \times n$ to $(n + 1) \times (n + 1)$ to obtain a
homogeneous transformation that includes translation. Also, the concepts extend
in a straightforward way to $\mathbb{R}^3$ and beyond. This enables the additional effects of
scaling and shearing to be incorporated directly into the concepts from Sections
3.2-3.4.

Flexible materials In some applications there is motivation to move beyond
linear transformations. Imagine trying to warp a flexible material, such as a
mattress, through a doorway. The mattress could be approximated by a 2D
array of links; however, the complexity and degrees of freedom would be too
cumbersome. For another example, suppose that a snake-like robot is designed by
connecting a hundred revolute joints together in a chain. The tools from Section
3.3 may be used to transform it with 100 rotation parameters, $\theta_1, \ldots, \theta_{100}$, but
this may become unwieldy for use in a planning algorithm. An alternative is to
approximate the snake with a deformable curve or shape.

For problems such as these, it is desirable to use a parameterized family of
curves or surfaces. Spline models are often most appropriate because these are
designed to provide easy control over the shape of a curve through the adjustment
of a small number of parameters. Other possibilities include generalized cylinders
and superquadric models that were mentioned in Section 3.1.3.
One difficulty is that complicated constraints may be imposed on the space
of allowable parameters. For example, each joint of a snake-like robot could have a
small range of rotation. This would be easy to model using a kinematic chain;
however, determining which splines from a spline family satisfy this extra constraint
may be difficult. Likewise, for manipulating flexible materials, there are usually
complicated constraints based on the elasticity of the material. Even determining
its correct shape under the application of some forces requires integration of an
elastic energy function over the material [?].
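The linear transformations of this section are simple to experiment with. The Python sketch below is illustrative only, with helper names invented here: it builds the 2D scaling, shearing, and rotation matrices, composes them by multiplication, and extends a $2 \times 2$ matrix to a $3 \times 3$ homogeneous matrix that also translates.

```python
import math

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def scaling(m11, m22):
    """Scale by m11 along the X axis and m22 along the Y axis."""
    return [[m11, 0.0], [0.0, m22]]

def shearing(m12):
    """Shear along the X axis by m12."""
    return [[1.0, m12], [0.0, 1.0]]

def rotation(theta):
    """Counterclockwise rotation by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def apply(M, point):
    """Apply a 2x2 matrix M to the point (x, y)."""
    x, y = point
    return (M[0][0] * x + M[0][1] * y, M[1][0] * x + M[1][1] * y)

def homogeneous(M, xt, yt):
    """Extend a 2x2 matrix to a 3x3 homogeneous matrix that translates by (xt, yt)."""
    return [[M[0][0], M[0][1], xt], [M[1][0], M[1][1], yt], [0.0, 0.0, 1.0]]

if __name__ == "__main__":
    print(apply(shearing(1.0), (0.0, 1.0)))        # shear with m12 = 1
    print(apply(scaling(-1.0, 1.0), (2.0, 5.0)))   # negated m11 gives a mirror image
    M = matmul(rotation(math.pi / 2), scaling(2.0, 1.0))  # scale first, then rotate
    print(apply(M, (1.0, 0.0)))
```

Because matrix multiplication is not commutative, the order in which the factors are multiplied matters: scaling and then rotating generally differs from rotating and then scaling.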
Literature

A thorough coverage of solid and boundary representations, including semi-algebraic
models, can be found in [331]. A standard geometric modeling book from a
CAD/CAM perspective, including NURBS models, is [568]. NURBS models are
also surveyed in [628].

The logical predicate defined in Section 3.1 can check whether a point lies inside
of $\mathcal{O}$ in $O(n)$ time, in which $n$ is the number of primitives. Many algorithms exist
that can accomplish this much more quickly. For a convex polygon, it can be
determined whether a point lies inside or outside in time $O(\ln n)$ by performing
range searching on the upper and lower chains of edges []. need more refs and
info.

Discussion of optimal decompositions See Suri's survey, pp. 429-444 of CRC 
Handbook on DCG. 

Theoretical algorithm issues regarding semi-algebraic models are covered in 
[558, 559]. The subject of transformations of rigid bodies and chains of bodies 
is covered in most robotics texts. Classic references include [180, 618]. The DH 
parameters were introduced in [316]. 

Need to talk about half-edge data structures, and related variations. 

There are many ways to parameterize the set of all 3D rotation matrices. The 
yaw-pitch-roll formulation was selected because it is the easiest to understand. 
There are generally 12 different variants of the yaw-pitch-roll formulation (also 
called Euler angles) based on different rotation orderings and axis selections. This
formulation, however, is not best suited for the development of motion planning
(sorry!) algorithms. It is the easiest (and safest) to use for making quick 3D
animations of motion planning output, but it incorrectly captures the state space for
the planning algorithm. Section 4.2 introduces the quaternion parameterization, 
which correctly captures this state space; however, it is harder to interpret when 
constructing examples. Therefore, it is helpful to understand both. In addition to 
Euler angles and quaternions, there is still motivation for many other parameteri- 
zations of rotations, such as spherical coordinates, Cayley-Rodrigues parameters, 
and stereographic projection. Chapter 5 of [155] provides extensive coverage of 
3D rotations and different parameterizations. 

The coverage of transformations of chains of bodies was heavily influenced 
by classic robotics texts [180, 618, ?]. The standard approach in these books is 
to introduce the kinematic chain formulations and DH parameters in the first 
couple of chapters, and then move on to topics that are crucial for controlling 
robot manipulators, including dynamics modeling, singularities, manipulability, 
and control. Since this book is concerned instead with planning algorithms, we 
depart at the point where dynamics would usually be covered, and move into 
careful study of the configuration space in Chapter 4. 
Interesting Web Pages

NYU Molecular Library: http://www.nyu.edu/pages/mathmol/library/ 

Exercises 

1. How would you define the semi-algebraic model to remove a triangular
"nose" from the region shown in Figure 3.4?

2. For distinct values of yaw, pitch, and roll, is it possible to generate the same
rotation? In other words, can $R(\alpha, \beta, \gamma) = R(\alpha', \beta', \gamma')$ hold if at least one of the
angles is distinct? Characterize the sets of angles for which this occurs.

3. Using rotation matrices, prove that 2D rotation is commutative, but 3D 
rotation is not. 

4. An alternative to the yaw-pitch-roll formulation from Section 3.2.3 is
considered here. Consider the following Euler angle representation of rotation
(there are many other variants). The first rotation is $R_Z(\gamma)$, which is just
(3.31) with $\alpha$ replaced by $\gamma$. The next two rotations are identical to the
yaw-pitch-roll formulation: $R_Y(\beta)$ is applied, followed by $R_Z(\alpha)$. This yields
$R_{euler}(\alpha, \beta, \gamma) = R_Z(\alpha)\, R_Y(\beta)\, R_Z(\gamma)$.

(a) Determine the matrix $R_{euler}$.

(b) Show that $R_{euler}(\alpha, \beta, \gamma) = R_{euler}(\alpha - \pi, -\beta, \gamma - \pi)$.

(c) Suppose that a rotation matrix is given as shown in (3.35). Show that
the Euler angles are

\[ \alpha = \operatorname{atan2}(r_{23}, r_{13}), \tag{3.77} \]

\[ \beta = \operatorname{atan2}\left(\sqrt{1 - r_{33}^2},\; r_{33}\right), \tag{3.78} \]

and

\[ \gamma = \operatorname{atan2}(r_{32}, -r_{31}). \tag{3.79} \]

5. There are 12 different variants of yaw-pitch-roll (or Euler angles), depending
on which axes are used and the order of these axes. Determine all of the
possibilities, using only notation such as $R_Z(\alpha) R_Y(\beta) R_Z(\gamma)$ for each one.
Give brief arguments that support why or why not specific combinations
of rotations are included in your list of 12.

6. Suppose that $A$ is a unit disc, centered at the origin, and $W = \mathbb{R}^2$. Assume
that $A$ is represented by a single, semi-algebraic primitive, $H = \{(x, y) \mid x^2 +
y^2 \leq 1\}$. Show that the transformed primitive is unchanged after rotation.
7. Consider the articulated chain of bodies shown below. There are three
identical rectangular bars in the plane, called $A_1$, $A_2$, $A_3$. Each bar has
width 2 and length 12. The distance between the two points of attachment
is 10. The first bar, $A_1$, is attached to the origin. The second bar, $A_2$, is
attached to $A_1$, and $A_3$ is attached to $A_2$. Each bar is allowed to
rotate about its point of attachment. The configuration of the chain can be
expressed with three angles, $(\theta_1, \theta_2, \theta_3)$. The first angle, $\theta_1$, represents the
angle between the segment drawn between the two points of attachment of
$A_1$ and the $x$ axis. The second angle, $\theta_2$, represents the angle between $A_2$
and $A_1$ ($\theta_2 = 0$ when they are parallel). The third angle, $\theta_3$, represents the
angle between $A_3$ and $A_2$.

[Figure: the three-bar chain attached at (0, 0), with points a, b, and c marked;
each bar has length 12, and its two points of attachment are 10 apart.]

Suppose the configuration is $(\pi/4, \pi/2, -\pi/4)$. Use the homogeneous
transformation matrices to determine the locations of points a, b, and c. Name
the set of all configurations such that the final point of attachment (near the
end of $A_3$) is at (0, 0) (you should be able to figure this out without using
the matrices).

8. A three-link articulated body that lives in a 2D world is shown below. The
first link is attached at (0, 0), but can rotate. Each remaining link is attached 
to another link with a revolute joint. The second link is a rigid ring, and 
the other two links are rectangular bars. 

(0,0) 




Assume that the structure is shown in the zero configuration. Suppose that
the structure is moved to the configuration $(\theta_1, \theta_2, \theta_3) = (\pi/4, \pi/2, \pi/4)$, in which
$\theta_1$ is the angle of Link 1, $\theta_2$ is the angle of Link 2 with respect to Link 1,
and $\theta_3$ is the angle of Link 3 with respect to Link 2. Using homogeneous
coordinate transformations, compute the position of the point at (4, 0) in
the figure above, when the structure is at configuration $(\pi/4, \pi/2, \pi/4)$ (the point
is attached to Link 3).

9. Approximate a spherical joint as a chain of three short links that are at- 
tached by revolute joints and give the sequence of transformation matrices. 
If the link lengths approach zero, show that the resulting sequence of trans- 
formation matrices can be used to exactly represent the kinematics of a 
spherical joint. 

10. Recall Example 3.4.1. How should the transformations be modified so that
the links are in the positions shown in Figure 3.26 precisely when $\theta_i = 0$ for
every revolute joint whose angle can be independently chosen?

11. Project: Virtual Tinkertoys Design and implement a system in which 
the user can attach various links to make a 3D kinematic tree. Assume that 
all joints are revolute. The user should be allowed to change parameters and 
see the resulting positions of all of the links. 

12. Project: Virtual Human Animation Construct a model of the human 
body as a tree of links in a 3D world. For simplicity, the geometric model 
may be limited to spheres and cylinders. Design and implement a system 
that displays the virtual human, and allows the user to click on joints of the 
body to enable them to rotate. 
Chapter 4

The Configuration Space

Chapter Status

What does this mean? Check

http://msl.cs.uiuc.edu/planning/status.html

for information on the latest version.



Chapter 3 only covered how to model and transform a collection of bodies; 
however, for the purposes of planning it is important to define a whole state space. 
The state space for motion planning is a set of possible transformations that could 
be applied to the robot. This will be referred to as the configuration space, based 
on the seminal work of Lozano-Perez [507, 503, 504], who introduced this notion 
in the context of planning. The motion planning literature was further unified 
around this concept by Latombe's book [437]. Once the configuration space is 
clearly understood, many motion planning problems that appear different in terms 
of geometry and kinematics can be solved by the same planning algorithms. This 
level of abstraction is therefore very important. 

This chapter provides important foundational material that will be very useful
in Chapters 5 to 8 and other places where planning over continuous state spaces
occurs. Many of the concepts introduced in this chapter come directly from
mathematics, particularly from topology. Therefore, Section 4.1 gives a basic overview
of topological concepts. Section 4.2 uses the concepts from Chapter 3 to define the
configuration space. After reading this, you should be able to precisely characterize the
configuration space and understand its structure. In Section 4.3, obstacles in the
world are transformed into obstacles in the configuration space, but it is important
to understand that this transformation may not be explicitly constructed. The
implicit representation of the state space is a recurring theme throughout
planning. Section 4.4 covers the important case of kinematic chains that have loops,
which was mentioned in Section 3.4. This case is so difficult that even the space
of transformations usually cannot be explicitly characterized (i.e., parameterized).
4.1 Basic Topological Concepts
4.1.1 Topological Spaces 

Recall from basic mathematics the concepts of open and closed intervals in the
set of real numbers, $\mathbb{R}$. An open interval, such as $(0, 1)$, includes all real numbers
between 0 and 1, except 0 and 1. However, for either endpoint, an infinite sequence
may be defined that converges to it. For example, the sequence $1/2, 1/4, \ldots, 1/2^i$
converges to 0 as $i$ tends to infinity. This means that we can get within any small,
positive distance from 0 or 1, but we cannot stand exactly on the boundary of the
interval. For a closed interval, such as $[0, 1]$, these boundary points are included.

The notion of an open set lies at the heart of topology. The open set definition 
that will appear here is a substantial generalization of the concept of an open 
interval. The concept will apply to a very general collection of subsets of some 
larger space. It is general enough to easily include any kind of configuration space 
that may be encountered in planning. 

A set $X$ is called a topological space if there is a collection of subsets of $X$
called open sets such that the following axioms hold:

1. The union of any collection of open sets is an open set.

2. The intersection of a finite number of open sets is an open set.

3. Both $X$ and $\emptyset$ are open sets.

Note that in the first axiom, the union of an infinite (even uncountable) number
of open sets may be taken, and the result must remain an open set. This will not
necessarily be true for closed sets.

For the special case of $X = \mathbb{R}$, the open sets include open intervals, as expected.
Note that many sets that are not intervals are also included because taking unions
and intersections of open intervals generates many other open sets. For example,
the set

\[
\bigcup_{i=1}^{\infty} \left( \frac{1}{3^i}, \frac{2}{3^i} \right),
\tag{4.1}
\]

which is an infinite union of intervals, is open.

Closed sets Open sets appear directly in the definition of a topological space.
It next seems that closed sets are needed. Suppose $X$ is a topological space. A
subset $C \subseteq X$ is defined to be a closed set if and only if $X \setminus C$ is an open set. Thus,
the complement of any open set is closed, and vice versa. Any closed interval, such
as $[0, 1]$, is a closed set because its complement, $(-\infty, 0) \cup (1, \infty)$, is an open set.
For another example, $(0, 1)$ is an open set; therefore, $\mathbb{R} \setminus (0, 1) = (-\infty, 0] \cup [1, \infty)$
is a closed set. The use of "(" may seem wrong in the last expression, but "["
cannot be used because $-\infty$ and $\infty$ do not belong to $\mathbb{R}$. Thus, the use of "(" is
just a notational quirk.
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Figure 4.1: An illustration of the boundary definition. Suppose X = M 2 , and U 
is a subset as shown. Three kinds of points appear: 1) x\ is a boundary point, 2) 
x 2 is an interior point, and 3) x 3 is an exterior point. Both X\ and x 2 are limit 
points. 

Here is a question to ponder: are all subsets of X either closed or open?
Although it appears that open sets and closed sets are opposites in some sense,
the answer is no. For $X = \mathbb{R}$, the interval $[0, 2\pi)$ is neither open nor closed (the
interval $[2\pi, \infty)$ is closed, and $(-\infty, 0)$ is open). Note that for any topological
space, X and $\emptyset$ are both open and closed!

Special points From the definitions and examples so far, it should seem that 
points on the "edge" or "border" of a set are important. There are several terms 
that capture where points are relative to the border. Let X be a topological space, 
and let U be any subset of X, and let x be any point in X. The following terms 
capture the position of point x relative to U (see Figure 4.1): 

• If there exists an open set, O, such that $x \in O$ and $O \subseteq U$, then x is called
an interior point of U. The set of all interior points of U is called the interior
of U, and is denoted by int(U).

• If there exists an open set, O, such that $x \in O$ and $O \subseteq X \setminus U$, then x is
called an exterior point with respect to U.

• If x is neither an interior point nor an exterior point, then it is called a
boundary point of U. The set of all boundary points of U is called the
boundary of U, and is denoted by $\partial U$.

• Every point $x \in X$ must be one of the three above; however, another term
is often used, even though it is redundant given the other three. If x is either
an interior point or a boundary point, then it is called a limit point of U.
The set of all limit points of U is a closed set called the closure of U, and is
denoted by cl(U). Note that $\mathrm{cl}(U) = \mathrm{int}(U) \cup \partial U$.

For the case of $X = \mathbb{R}$, the boundary points are the endpoints of intervals.
Thus, 0 and 1 are boundary points of the intervals (0, 1), [0, 1], [0, 1), and (0, 1].
Thus, U may or may not include its boundary points. All of the points in (0, 1)
are interior points, and all of the points in [0, 1] are limit points. The motivation
for the name "limit point" comes from the fact that such a point might be the limit
of an infinite sequence of points. For example, 0 is the limit point of the sequence
generated by $1/2^i$ for each $i \in \mathbb{N}$, the natural numbers.

There are several convenient consequences of the definitions. A closed set, C,
contains the limit point of any convergent sequence of points in C. This implies that
it contains all of its boundary points. The closure, cl, always results in a closed
set because it adds all of the boundary points to the set. On the other hand, an
open set contains none of its boundary points. These interpretations will come in
handy when considering obstacles in the configuration space for motion planning.
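
To make the terms above concrete for intervals of $\mathbb{R}$, the following is a minimal
Python sketch. The Interval class and the function names are hypothetical helpers
introduced only for illustration, and the intervals are assumed to be nondegenerate
(lo < hi). Note that the classification of a point does not depend on whether the
interval includes its endpoints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A nondegenerate interval of R with endpoints lo < hi (illustrative helper)."""
    lo: float
    hi: float
    lo_closed: bool = False
    hi_closed: bool = False

    def contains(self, x: float) -> bool:
        above = x >= self.lo if self.lo_closed else x > self.lo
        below = x <= self.hi if self.hi_closed else x < self.hi
        return above and below


def classify(x: float, u: Interval) -> str:
    """Classify x as an interior, boundary, or exterior point of u."""
    if u.lo < x < u.hi:
        return "interior"  # a small open interval around x fits inside u
    if x == u.lo or x == u.hi:
        return "boundary"  # every open interval around x meets u and its complement
    return "exterior"      # a small open interval around x misses u entirely


def is_limit_point(x: float, u: Interval) -> bool:
    """Limit points are the interior points together with the boundary points."""
    return classify(x, u) != "exterior"
```

For example, 0 is classified as a boundary point of both Interval(0.0, 1.0) and
Interval(0.0, 1.0, True, True), even though only the second one contains it.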

Some examples The definition of a topological space is so general that an 
incredible variety of topological spaces can be constructed. 

Example 4.1.1 ($X = \mathbb{R}^n$) We should expect that $\mathbb{R}^n$ for any integer n is a
topological space. This requires characterizing the open sets. An open ball, $B(x, \rho)$, is
the set of points in the interior of a sphere of radius $\rho$, centered at x. Thus,

$$B(x, \rho) = \{ x' \in \mathbb{R}^n \mid \|x' - x\| < \rho \}, \tag{4.2}$$

in which $\| \cdot \|$ denotes the Euclidean norm (or magnitude) of its argument. Each such set is
considered an open set in $\mathbb{R}^n$. Furthermore, all other open sets can be expressed
as a countable union of open balls.¹ For the case of $\mathbb{R}$, note that this degenerates
to representing all open sets as a union of intervals, which we have done so far.

Even though it is possible to express open sets of $\mathbb{R}^n$ as unions of balls, we
prefer to use other representations, with the understanding that one could revert
to open balls if necessary. The primitives of Section 3.1 can be used to generate
many interesting open and closed sets. For example, any algebraic primitive expressed
in the form $H = \{ x \in \mathbb{R}^n \mid f(x) \leq 0 \}$ produces a closed
set. Taking finite unions and intersections of these primitives will produce more
closed sets. Therefore, all of the models from Sections 3.1.1 and 3.1.2 produce an
obstacle region, O, that is a closed set. As mentioned in Section 3.1.2, sets
constructed only from primitives that use the < relation are open. ■
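
As a small illustration of why an open ball deserves to be called open, the following
Python sketch tests membership in $B(x, \rho)$ and computes the radius of a smaller
ball that still fits inside it. The function names are hypothetical and are used only
for this sketch.

```python
import math


def in_open_ball(x_prime, x, rho):
    """True if x_prime lies in the open ball B(x, rho); the bounding sphere is excluded."""
    return math.dist(x_prime, x) < rho


def margin(x_prime, x, rho):
    """Radius of a ball centered at x_prime that stays inside B(x, rho).

    The value is positive exactly when x_prime is in the open ball, which is
    why every point of the ball is an interior point.
    """
    return rho - math.dist(x_prime, x)
```

A point on the bounding sphere has margin zero, so no ball around it fits inside;
it is a boundary point and is not included in the open ball.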

Example 4.1.2 (Subspace topology) A new topological space can easily be
constructed from a subset of a topological space. This will be very useful in the
coming sections. Let X be a topological space, and let $Y \subseteq X$ be a subset. The
subspace topology on Y is obtained by defining the open sets to be any subset of
Y that can be represented as $U \cap Y$ for some open set U of X. Thus, the open
sets for Y are almost the same as for X, except the points that do not lie in Y are
trimmed away. New subspaces can be constructed by intersecting open sets of $\mathbb{R}^n$
with a complicated region defined by semi-algebraic models. This leads to many
interesting topological spaces, some of which will appear later in this chapter. ■

1 Such a collection is often referred to as a basis.

Example 4.1.3 (Trivial topology) For any set X, there is always one trivial
example of a topological space that can be constructed from it. Declare that X
and $\emptyset$ are the only open sets. Note that all of the axioms are satisfied. ■

Example 4.1.4 (X = {cat, dog, tree, house}) It is important to keep in mind the 
almost absurd level of generality that is allowed by the definition of a topological 
space. A topological space can be defined for any set, as long as the declared open 
sets obey the axioms. For this case, suppose that {cat} and {dog} are open sets. 
Then, {cat, dog} must also be an open set. Closed sets and boundary points can
even be derived for this topology once the open sets are defined. ■
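
For a finite set such as the one in Example 4.1.4, the axioms can be checked
mechanically. The following Python sketch does this under the assumption that both X
and the declared collection of open sets are finite; the function names are
hypothetical.

```python
from itertools import combinations


def is_topology(X, open_sets):
    """Check the three axioms for a finite collection of subsets of a finite set X."""
    X = frozenset(X)
    opens = {frozenset(s) for s in open_sets}
    if any(not s <= X for s in opens):
        return False  # every open set must be a subset of X
    if X not in opens or frozenset() not in opens:
        return False  # X and the empty set must both be open
    # For a finite collection, closure under pairwise unions and intersections
    # implies closure under all unions and all finite intersections.
    return all((a | b) in opens and (a & b) in opens
               for a, b in combinations(opens, 2))


def closed_sets(X, open_sets):
    """Closed sets are the complements of the open sets."""
    X = frozenset(X)
    return {X - frozenset(s) for s in open_sets}
```

Declaring {cat} and {dog} open without also declaring {cat, dog} open fails the
check, which matches the observation in the example.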

After the last example, it seems that topological spaces are so general that not 
much can be said about them. Most spaces that are considered in topology and 
analysis satisfy more axioms. For $\mathbb{R}^n$ and any configuration spaces that arise in
this book, the following is satisfied:

Hausdorff Axiom: For any distinct $x_1, x_2 \in X$, there exist open sets $A_1$ and
$A_2$ such that $x_1 \in A_1$, $x_2 \in A_2$, and $A_1 \cap A_2 = \emptyset$.

In other words, it is possible to separate $x_1$ and $x_2$ into nonoverlapping open
sets. Think about how to do this for $\mathbb{R}^n$ by selecting small enough open balls. Any
topological space X that satisfies the Hausdorff axiom is referred to as a Hausdorff 
space. The manifold definition that is used in Section 4.1.2 will guarantee that 
the resulting topological space is a Hausdorff space. 

Continuous functions A very simple definition of continuity exists for topological
spaces. It nicely generalizes the definitions from standard calculus. Let
$f : X \to Y$ denote a function between topological spaces X and Y. For any set
$B \subseteq Y$, let the preimage of B be denoted and defined by

$$f^{-1}(B) = \{ x \in X \mid f(x) \in B \}. \tag{4.3}$$

Note that this definition does not require f to have an inverse.

The function f is called continuous if $f^{-1}(O)$ is an open set for every open
set $O \subseteq Y$. Analysis is greatly simplified by this definition of continuity. For
example, to show that the composition of continuous functions is continuous requires only a
one-line argument that the preimage of the preimage will be open. Compare this
to the cumbersome classical proof that requires a mess of $\delta$'s and $\epsilon$'s.
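
The preimage definition can be tried directly on small finite spaces. In the
following Python sketch, a function is represented as a dictionary and each topology
as a list of sets; both representations and all names are assumptions made for
illustration.

```python
def preimage(f, X, B):
    """Return the points of X that f maps into B; f is a dict from X to Y."""
    return frozenset(x for x in X if f[x] in B)


def is_continuous(f, X, opens_X, opens_Y):
    """f is continuous if the preimage of every open set of Y is open in X."""
    opens_X = {frozenset(s) for s in opens_X}
    return all(preimage(f, X, O) in opens_X for O in opens_Y)
```

With X = {a, b} and open sets $\emptyset$, {a}, and {a, b}, the map that swaps a and b
is bijective but not continuous, because the preimage of {a} is {b}, which is not
open. No inverse of f is needed to evaluate the preimage.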

Homeomorphism: Making a donut into a coffee cup You might have heard
the expression that to a topologist, a donut and a coffee cup appear the same.² In
many branches of mathematics, it is important to define when two basic objects
are equivalent. In graph theory (and group theory), this equivalence relation
is called an isomorphism. In topology, the most basic equivalence is based on
homeomorphism, which allows spaces that appear quite different in most other
subjects to be declared equivalent in topology. A donut and a coffee cup (with one
handle) are considered equivalent because both have a single hole. This notion
needs to be made more precise!

Suppose $f : X \to Y$ is a bijective (one-to-one and onto) function between topological
spaces X and Y. Since f is bijective, the inverse $f^{-1}$ exists. If both f and $f^{-1}$ are
continuous, then f is called a homeomorphism. Two topological spaces, X and Y,
are said to be homeomorphic, denoted by $X \cong Y$, if there exists a homeomorphism
between them. This implies an equivalence relation on
the set of topological spaces (verify that the reflexive, symmetric, and transitive
properties are implied by the homeomorphism).

Example 4.1.5 (Interval homeomorphisms) Any open interval of $\mathbb{R}$ is
homeomorphic to any other open interval. For example, (0, 1) can be mapped to (0, 5) by
the continuous mapping $x \mapsto 5x$. Note that (0, 1) and (0, 5) are each being
interpreted here as topological subspaces of $\mathbb{R}$. This kind of homeomorphism can
be generalized substantially using linear algebra. If a subset $X \subseteq \mathbb{R}^n$ can
be mapped to another, $Y \subseteq \mathbb{R}^n$, via a nonsingular linear transformation, then X
and Y are homeomorphic. For example, the rigid body transformations of the
previous chapter were examples of homeomorphisms applied to the robot. Thus,
the topology of the robot does not change when it is translated or rotated. (In
this example, note that the robot itself is the topological space. This will not be
the case for the rest of the chapter.)

Be careful when mixing closed and open sets. The space [0, 1] is not homeomorphic
to (0, 1), and neither is homeomorphic to [0, 1). The endpoints cause trouble
when trying to make a bijective, continuous function. Surprisingly, a bounded
and an unbounded set may be homeomorphic. A subset X of $\mathbb{R}^n$ is called bounded if
there exists a ball $B \subset \mathbb{R}^n$ such that $X \subset B$. The mapping $x \mapsto 1/x$ establishes
that (0, 1) and $(1, \infty)$ are homeomorphic. The mapping $x \mapsto \tan^{-1} x$ establishes
that all of $\mathbb{R}$ and $(-\pi/2, \pi/2)$ are homeomorphic! ■
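
The two mappings at the end of Example 4.1.5 can be spot-checked numerically. The
following Python sketch does not prove anything about continuity; it only confirms,
at sample points, that each map lands in the claimed target set and is undone by its
inverse. The function names and tolerances are assumptions made for this sketch.

```python
import math


def reciprocal(x):
    """Maps (0, 1) onto (1, infinity); the same formula gives the inverse map."""
    return 1.0 / x


def round_trip_reciprocal(x):
    """For x in (0, 1): the image exceeds 1, and applying 1/y recovers x."""
    y = reciprocal(x)
    return y > 1.0 and math.isclose(reciprocal(y), x)


def round_trip_arctan(x):
    """For any real x: atan(x) lies in (-pi/2, pi/2), and tan recovers x."""
    y = math.atan(x)
    in_range = -math.pi / 2 < y < math.pi / 2
    return in_range and math.isclose(math.tan(y), x, rel_tol=1e-9, abs_tol=1e-12)
```

Both checks succeed for sample points of the domain, such as 0.25 for the reciprocal
map and 1000 for the arctangent map.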

Figure 4.2: Even though the graphs are not isomorphic, the corresponding topological
spaces may be homeomorphic due to useless vertices. The example graphs
map into $\mathbb{R}^2$ and are all homeomorphic to a circle.

Figure 4.3: The following topological graphs map into subsets of $\mathbb{R}^2$ that are not
homeomorphic to each other.

2 I also heard a vulgar version (from a mathematician) about topologists not knowing their
... from a hole in the ground.

3 In topology this is called a 1-complex [317].

4 Technically, the images of the topological graphs, as subspaces of X, are homeomorphic,
not the graphs themselves.

Example 4.1.6 (Topological graphs) Let X be a topological space. The previous
example can be extended nicely to make homeomorphisms look like graph
isomorphisms. Let a topological graph³ be a graph for which every vertex corresponds
to a point in X, and every edge corresponds to a continuous, injective
(one-to-one) function, $\tau : [0, 1] \to X$. The image of $\tau$ connects the points in X
that correspond to the endpoints (vertices) of the edge. The images of different
edge functions are not allowed to intersect, except at vertices. Recall from graph
theory that two graphs, $G_1(V_1, E_1)$ and $G_2(V_2, E_2)$, are called isomorphic if there
exists a bijective mapping, $f : V_1 \to V_2$, such that there is an edge between $v_1$
and $v_1'$ in $G_1$ if and only if there is an edge between $f(v_1)$ and $f(v_1')$ in $G_2$.

The bijective mapping used in the graph isomorphism can be extended to produce
a homeomorphism. Each edge in $E_1$ is mapped continuously to its corresponding
edge in $E_2$. The mappings will nicely coincide at the vertices. Now you should see
that two topological graphs are homeomorphic if they are isomorphic under the
standard definition from graph theory.⁴ What if the graphs are not isomorphic?
There is still a chance that the topological graphs may be homeomorphic, as
shown in Figure 4.2. The problem is that there appear to be "useless" vertices
in the graph. By removing vertices of degree two that can be deleted without
affecting the connectivity of the graph, the problem is fixed. In this case, graphs
that are not isomorphic produce topological graphs that are not homeomorphic.
This allows many distinct, interesting topological spaces to be constructed. A few
are shown in Figure 4.3. ■
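
The removal of "useless" degree-two vertices in Example 4.1.6 can be sketched in
Python as follows. This is one possible interpretation, not a prescribed algorithm:
the graph is assumed to be given as a list of vertex pairs with string labels,
parallel edges and self-loops are allowed to appear during the process, and a vertex
whose only edge is a self-loop is kept so that a circle retains one vertex.

```python
def smooth(edges):
    """Repeatedly remove degree-two vertices, merging their two incident edges."""
    edges = [tuple(e) for e in edges]
    changed = True
    while changed:
        changed = False
        for v in sorted({u for e in edges for u in e}):
            incident = [k for k, e in enumerate(edges) if v in e]
            if len(incident) != 2:
                continue  # degree is not two (or v carries only a self-loop)
            i, j = incident
            if edges[i][0] == edges[i][1] or edges[j][0] == edges[j][1]:
                continue  # a self-loop plus another edge gives degree three
            # Neighbors of v along its two incident edges.
            a = edges[i][1] if edges[i][0] == v else edges[i][0]
            b = edges[j][1] if edges[j][0] == v else edges[j][0]
            edges = [e for k, e in enumerate(edges) if k not in (i, j)]
            edges.append((a, b))  # one edge replaces the two that met at v
            changed = True
            break
    return edges
```

Under these assumptions, a triangle and a square both reduce to a single vertex with
one self-loop, which reflects that both are homeomorphic to a circle even though the
original graphs are not isomorphic.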



4.1.2 Manifolds 

In motion planning, efforts are made to ensure that the resulting configuration 
space has nice properties that reflect the true structure of the space of transfor- 
mations. One important kind of topological space, which is general enough to 
include most of the configuration spaces considered in Part II, is called a mani- 
fold. Intuitively, a manifold can be considered as a "nice" topological space that 
behaves at every point like our intuitive notion of a surface. 

Manifold definition A topological space $M \subseteq \mathbb{R}^m$ is a manifold⁵ if for every
$x \in M$, an open set $O \subset M$ exists such that: 1) $x \in O$, 2) O is homeomorphic to
$\mathbb{R}^n$, and 3) n is fixed for all $x \in M$. The fixed n is referred to as the dimension of
the manifold, M. The second condition is the most important. It states that in
the vicinity of any point, the space behaves like $\mathbb{R}^n$; we can move a small amount
in any direction. Several simple examples that may or may not be manifolds are
shown in Figure 4.4.

One natural consequence will be that $m \geq n$. According to Whitney's theorem
[], m need not exceed 2n. In other words, $\mathbb{R}^{2n}$ is "big enough" to hold any n-dimensional
manifold. Technically, it is said that the n-dimensional manifold, M, is embedded
in $\mathbb{R}^m$, which means that an injective mapping exists from M to $\mathbb{R}^m$ (if it is not
injective, then the topology of M could change).

As it stands, it is impossible for a manifold to include its boundary points
because no open set that contains one of them is homeomorphic to $\mathbb{R}^n$. A manifold
with boundary can be defined by requiring that neighborhoods of boundary points of M
are homeomorphic to half spaces of dimension n, which were defined for $\mathbb{R}^2$ and
$\mathbb{R}^3$ in Section 3.1, and that neighborhoods of interior points are homeomorphic
to $\mathbb{R}^n$.

5 Manifolds that are not subsets of $\mathbb{R}^m$ may also be defined. This requires that M is a
Hausdorff space and is second countable, which means that there is a countable number of open
sets from which any other open set can be constructed by taking a union of some of them.
These conditions are automatically satisfied when assuming $M \subseteq \mathbb{R}^m$; thus, it avoids these extra
complications and is still general enough for our purposes. Some authors use the term manifold
to refer to a differentiable manifold. This requires the definition of an atlas of charts, and the
homeomorphism is replaced by diffeomorphism. This extra structure is not needed here, but
will be introduced when it is needed in Chapter 13.

Figure 4.4: Some subsets of $\mathbb{R}^2$ that may or may not be manifolds.

The presentation now turns to ways of constructing some manifolds that fre- 
quently appear in motion planning. It is important to keep in mind that two 
manifolds will be considered equivalent if they are homeomorphic (recall the donut 
and coffee cup). 

Cartesian products The Cartesian product provides a convenient way to construct
new topological spaces from existing ones. Suppose that X and Y are
topological spaces. The Cartesian product, $X \times Y$, defines a new topological
space as follows. Every $x \in X$ and $y \in Y$ generates a point (x, y) in
$X \times Y$. Taking the Cartesian product of one open set from X and one from Y
yields an open set in $X \times Y$; exactly one such set exists for
every pair of open sets that can be formed by taking one from X and one from
Y. These products serve as a basis: the open sets of $X \times Y$ are exactly the unions
of such products; therefore, its open sets are automatically determined.

A familiar example of a Cartesian product is $\mathbb{R} \times \mathbb{R}$, which is equivalent to $\mathbb{R}^2$.
In general, $\mathbb{R}^n$ is equivalent to $\mathbb{R} \times \mathbb{R}^{n-1}$. The Cartesian product can be taken
over many spaces at once. For example, $\mathbb{R} \times \mathbb{R} \times \cdots \times \mathbb{R} = \mathbb{R}^n$. In the coming
text, interesting manifolds will be constructed via Cartesian products.



One-dimensional manifolds $\mathbb{R}$ is the most obvious example of a one-dimensional
manifold because $\mathbb{R}$ certainly looks like $\mathbb{R}$ in the vicinity of every point. The range
can be restricted to the unit interval to yield the manifold (0, 1) because they are
homeomorphic (recall Example 4.1.5).

Another 1D manifold, which is not homeomorphic to (0, 1), is a circle, $\mathbb{S}^1$. In
this case $\mathbb{R}^m = \mathbb{R}^2$, and let

$$\mathbb{S}^1 = \{ (x, y) \in \mathbb{R}^2 \mid x^2 + y^2 = 1 \}. \tag{4.4}$$

If you are thinking like a topologist, it should appear that this particular circle
is not important because there are numerous ways to define manifolds that are
homeomorphic to $\mathbb{S}^1$. For any manifold that is homeomorphic to $\mathbb{S}^1$, we will
sometimes say that the manifold is $\mathbb{S}^1$, just represented in a different way. Also,
$\mathbb{S}^1$ will be called a circle, but this is meant only in the topological sense; it is
homeomorphic to a circle that we learned about in high school geometry. Also,
when referring to $\mathbb{R}$, we might instead substitute (0, 1) without any trouble.

Another way to represent $\mathbb{S}^1$ will be given by identification, which is a general
method of declaring that some points of a space are identical, even though they were
originally distinct.⁶ For a topological space X, let $X / \sim$ denote that X has been
redefined through some form of identification. The open sets of X are redefined
by directly applying the identification to their elements. Using identification, $\mathbb{S}^1$
can be defined as $[0, 1] / \sim$, in which the identification declares that 0 and 1
are equivalent, denoted as $0 \sim 1$. This has the effect of "gluing" the ends of
the interval together, forming a closed loop. To see the homeomorphism that
makes this possible, just use polar coordinates to obtain $\theta \mapsto (\cos 2\pi\theta, \sin 2\pi\theta)$.
You should already be familiar with 0 and $2\pi$ leading to the same point in polar
coordinates; here they are just normalized to 0 and 1. Letting $\theta$ run from 0 up
to 1, and then "wrap around" to 0, is a convenient way to represent $\mathbb{S}^1$ because it
does not need to be curved as in (4.4).

It might appear that identifications are cheating because the definition of a 
manifold requires it to be a subset of $\mathbb{R}^m$. This is not a problem because Whitney's
theorem states that any n-dimensional manifold can be embedded in $\mathbb{R}^{2n}$ [317].
The identifications just cut down on the number of dimensions that are needed for 
visualization. They are also convenient in the implementation of motion planning 
algorithms. 
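
As an illustration of how the identification of 0 and 1 might be used in an
implementation, the following Python sketch represents a point of the circle by a
number in [0, 1) and lets values wrap around. The function names are hypothetical,
and the distance function is an extra convenience that is not defined in the text.

```python
import math


def wrap(theta):
    """Representative of theta in [0, 1), so that 0 and 1 name the same point."""
    return theta % 1.0


def embed(theta):
    """Map theta to the unit circle in the plane via polar coordinates."""
    angle = 2.0 * math.pi * theta
    return (math.cos(angle), math.sin(angle))


def circle_distance(a, b):
    """Shortest distance between a and b when travel through the glued ends is allowed."""
    d = abs(wrap(a) - wrap(b))
    return min(d, 1.0 - d)
```

With this representation, 0.9 and 0.1 are only 0.2 apart because the shortest route
passes through the glued endpoint, and embed gives the same point of the plane for 0
and 1 up to rounding.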

Two-dimensional manifolds A variety of interesting, two-dimensional manifolds
can be defined by applying the Cartesian product to one-dimensional manifolds.
The two-dimensional manifold $\mathbb{R}^2$ is formed by $\mathbb{R} \times \mathbb{R}$. The product $\mathbb{R} \times \mathbb{S}^1$
defines a manifold that is equivalent to an infinite cylinder. The product $\mathbb{S}^1 \times \mathbb{S}^1$
is a manifold that is equivalent to a torus (the outer shell of a donut).

Can any other two-dimensional manifolds be defined? See Figure 4.5. The
identification idea can be applied to generate several new manifolds. Start with
an open square $M = (0, 1) \times (0, 1)$, which is homeomorphic to $\mathbb{R}^2$. Let (x, y)
denote a point in the plane. A flat cylinder is obtained by making the identification
$[0, y] \sim [1, y]$ for all $y \in (0, 1)$, and adding all of these points to M. The result is
depicted in Figure 4.5 by drawing arrows where the identification occurs.



6 This is usually defined more formally and called a quotient topology.

Figure 4.5: Some common two-dimensional manifolds. For each manifold, the figure
shows the identification applied to the edges of a square (drawn with arrows), its
name, and its notation: plane ($\mathbb{R}^2$), cylinder ($\mathbb{R} \times \mathbb{S}^1$), Möbius band,
torus ($\mathbb{T}^2$), Klein bottle, projective plane ($\mathbb{RP}^2$), and two-sphere ($\mathbb{S}^2$).

A Möbius band can be constructed by taking a strip of paper and connecting
the ends after making a 180-degree twist. This result is not homeomorphic to
the cylinder. The Möbius band can be constructed by putting the twist into the
identification, as $[0, y] \sim [1, 1 - y]$ for all $y \in (0, 1)$. In this case, the arrows are
drawn in opposite directions. The Möbius band has the famous properties that
it has only one side (trace along the paper strip with a pencil, and you will visit
both sides of the paper) and is nonorientable (if you try to draw it in the plane,
without using identification tricks, it will always have a twist).

For all of the cases so far, there has been a boundary to the set. The next few
manifolds will not even have a boundary, even though they may be bounded.
If you were to live in one of them, it means that you could walk forever along any 
trajectory and never encounter the edge of your universe. It might seem like the 
universe is unbounded, but it would only be an illusion. Furthermore, there are 
several distinct possibilities for the universe that are not homeomorphic to each 
other. In higher dimensions, such possibilities are the subject of cosmology, which 
is a branch of astrophysics that uses topology to characterize the structure of the 
universe. 

A torus can be constructed by performing identifications of the form [0, y] ~ 
[1, y], which was done for the cylinder, and also [x, 0] ~ [x, 1], which identifies the 
top and bottom. Note that the point (0, 0) must be included, and is identified 
with three other points. Double arrows are used in Figure 4.5 to indicate the 
top and bottom identification. All of the identification points must be added to 
M. Note that there are no twists. A funny interpretation of the resulting flat 
torus is as the universe appears for a spacecraft in some 1980s-style asteroids-like 
video games. The spaceship flies off of the screen in one direction and appears 
somewhere else, as prescribed by the identification. 
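
The video-game interpretation of the flat torus can be sketched in a few lines of
Python. The function name, the time step, and the use of [0, 1) as the representative
range for each coordinate are assumptions made for illustration.

```python
def step_on_flat_torus(pos, vel, dt=1.0):
    """Advance a point on the unit square with both pairs of opposite sides identified."""
    x, y = pos
    vx, vy = vel
    # The modulo applies the identifications [0, y] ~ [1, y] and [x, 0] ~ [x, 1]:
    # leaving through one side re-enters through the opposite side, with no twist.
    return ((x + vx * dt) % 1.0, (y + vy * dt) % 1.0)
```

A spaceship at (0.9, 0.5) moving with velocity (0.2, 0) leaves the right side of the
screen and reappears near x = 0.1 at the same height.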

Two interesting manifolds can be made by adding twists. Consider performing
all of the identifications that were made for the torus, except put a twist in the
side identification, as was done for the Möbius band. This yields a fascinating
manifold called the Klein bottle, which can be embedded in $\mathbb{R}^4$ as a closed
two-dimensional surface in which the inside and the outside are the same! (This is
in a sense similar to that of the Möbius band.) Now suppose there are twists in
both the sides and the top and bottom. This results in the most bizarre manifold
yet: the real projective plane, $\mathbb{RP}^2$. The 3D version, $\mathbb{RP}^3$, happens to be one of
the most important manifolds for motion planning!

One extremely important two-dimensional manifold remains to be defined. Let
$\mathbb{S}^2$ denote the sphere, which can be easily defined as

$$\mathbb{S}^2 = \{ (x, y, z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 = 1 \}. \tag{4.5}$$

Another way to define $\mathbb{S}^2$ is by making the identifications shown in the last line
of Figure 4.5. A dashed line is indicated where the equator might appear, if we
wanted to make a distorted wall map of the earth. The poles would be at the
upper left and lower right corners.

Higher-dimensional manifolds The construction techniques used for the
two-dimensional manifolds generalize nicely to higher dimensions. Of course, $\mathbb{R}^n$ is
an n-dimensional manifold. An n-dimensional torus, $\mathbb{T}^n$, can be made by taking
a Cartesian product of n copies of $\mathbb{S}^1$. Note that $\mathbb{S}^1 \times \mathbb{S}^1 \neq \mathbb{S}^2$. Therefore, the
notation $\mathbb{T}^n$ is used for $(\mathbb{S}^1)^n$. Different kinds of n-dimensional cylinders can be
made by forming a Cartesian product $\mathbb{R}^i \times \mathbb{T}^j$ for integers i and j such that
i + j = n. Higher dimensional spheres can be defined as

$$\mathbb{S}^n = \{ x \in \mathbb{R}^{n+1} \mid \|x\| = 1 \}, \tag{4.6}$$

in which $\|x\|$ denotes the Euclidean norm of x.

Many interesting spaces can be made by identifying faces of the cube $(0, 1)^n$
(or even faces of a polyhedron or polytope), especially if different kinds of twists
are allowed. An n-dimensional flat real projective space can be defined in this
way, for example. Lens spaces are an interesting family of manifolds that can be
constructed by identification of polyhedral faces [662].

The standard definition of an n-dimensional real projective space, $\mathbb{RP}^n$, is the
set of all lines in $\mathbb{R}^{n+1}$ that pass through the origin. Each line is considered as a
point in $\mathbb{RP}^n$. Using the definition of $\mathbb{S}^n$ in (4.6), note that each one of these lines
in $\mathbb{R}^{n+1}$ intersects $\mathbb{S}^n \subset \mathbb{R}^{n+1}$ in exactly two places. These intersection points are
called antipodal, which means that they are as far from each other as possible on
$\mathbb{S}^n$. They are also unique for each line. If we identify all pairs of antipodal points
of $\mathbb{S}^n$, a continuous bijection can be defined between the lines in $\mathbb{R}^{n+1}$ and the
antipodal pairs on the sphere. This means that the resulting manifold $\mathbb{S}^n / \sim$ is
homeomorphic to $\mathbb{RP}^n$.

Another way to interpret this is that $\mathbb{RP}^n$ is just the upper half of $\mathbb{S}^n$, but
with every equatorial point identified with its antipodal point. Thus, if you try
to walk into the southern hemisphere, you will find yourself on the other side of
the world walking north. It is helpful to visualize the special case of $\mathbb{RP}^2$ and $\mathbb{S}^2$.
Imagine warping the picture of $\mathbb{RP}^2$ from Figure 4.5 from a square into a circular
disc, with opposite points identified. This also represents $\mathbb{RP}^2$. The center of the
disc can now be lifted out of the plane to form the upper half of $\mathbb{S}^2$.
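
The "upper half of the sphere" picture suggests one way to store a point of
$\mathbb{RP}^n$ in a program: keep whichever of the two antipodal unit vectors lies in the
upper hemisphere. The following Python sketch does this. The function names and the
tie-breaking rule on the equator are assumptions made for illustration, and the input
vector is assumed to be nonzero.

```python
import math


def rp_representative(v):
    """Return a canonical unit vector for the line through the origin and v.

    Assumes v is a nonzero vector of R^(n+1).
    """
    norm = math.sqrt(sum(c * c for c in v))
    u = tuple(c / norm for c in v)
    # Choose the antipodal point whose last nonzero coordinate is positive, so
    # the representative lies in the upper hemisphere; on the equator the tie
    # is broken by the next coordinate down.
    last_nonzero = next(c for c in reversed(u) if c != 0.0)
    if last_nonzero < 0.0:
        u = tuple(-c for c in u)
    return u


def same_point_of_rp(v, w, tol=1e-12):
    """True if v and w span the same line, i.e., the same point of RP^n."""
    a, b = rp_representative(v), rp_representative(w)
    return all(abs(x - y) <= tol for x, y in zip(a, b))
```

Under this convention, (1, 2, 3) and (-1, -2, -3) yield the same representative,
which mirrors the identification of antipodal points.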

4.1.3 Paths and Connectivity 

At the core of motion planning is determining whether one part of a space is reachable
from another. In Chapter 2, one part of the space was reached from another by applying
a sequence of actions. For a continuous state space, we would need a continuum
of actions. The application of the continuum of actions produces a path in the
state space. This will be formalized in Part IV, but the short explanation is that
the path is obtained through the integration of a vector field that is derived from
the plan. Here, we consider only the effect of a plan, which is the continuum of states
visited. Therefore, the notion of a continuous path will become very important.

Paths Let X be a topological space, which for our purposes will also be a
manifold. A path, $\tau$, in X is a continuous function, $\tau : [0, 1] \to X$. Other intervals
of $\mathbb{R}$ may alternatively be used for the domain of $\tau$. Note that a path is a function,
not a set of points. Each point along the path is given by $\tau(s)$ for some $s \in [0, 1]$.
This makes it appear as a nice generalization of the sequence of states visited
when a plan from Chapter 2 is applied. Recall that in Chapter 2 a countable set of
stages was defined, and the states visited could be represented as $x_1, x_2, \ldots$. In
the current setting $\tau(s)$ is used, in which s replaces the stage index. To make the
connection clearer, we could use x instead of $\tau$, to obtain x(s) for each $s \in [0, 1]$.

Connected vs. path connected A topological space, X, is said to be connected
if it cannot be represented as the union of two disjoint, nonempty, open
sets. While this definition is rather elegant and general, connectedness of X does
not imply that a path exists between any pair of points in X, thanks to pathological
examples like the topologist's sine curve:

$$X = \{ (x, y) \in \mathbb{R}^2 \mid x = 0 \text{ or } y = \sin(1/x) \}. \tag{4.7}$$

The $\sin(1/x)$ part creates oscillations near the Y axis in which the frequency tends
to infinity. After union is taken with the Y axis, this space is connected, but there
is no path that reaches the Y axis from the sine curve.

How can we avoid such problems? The standard way to fix this is to use the
path definition directly in the definition of connectedness. A topological space,
X, is said to be path connected if for all $x_1, x_2 \in X$, there exists a path, $\tau$, such
that $\tau(0) = x_1$ and $\tau(1) = x_2$. It can be shown that if X is path connected, then
it is also connected in the sense defined previously.

Another way to fix it is to make restrictions on the kinds of topological spaces 
that will be considered. This approach will be taken here by assuming that all 
topological spaces are manifolds. In this case, no strange things like (4.7) can
happen,⁷ and the definitions of connected and path connected coincide []. Therefore,
we will just say a space is connected. However, it is important to remember
that this definition of connected is sometimes inadequate, and one should really 
say that X is path connected. 

Simply connected Now that the notion of connectedness has been established, 
the next step is to express different kinds of connectivity. This may be done by 
using the notion of homotopy, which can intuitively be considered as a way to 
continuously "warp" or "morph" one path into another, as depicted in Figure 
4.6.a. 

Figure 4.6: a) Homotopy continuously warps one path into another. b) The image
of the path cannot be continuously warped over a hole in $\mathbb{R}^2$ because it causes a
discontinuity. In this case, the two paths are not homotopic.

7 The topologist's sine curve is not a manifold because all open sets that contain the point
(0, 0) contain some of the points from the sine curve. These open sets are not homeomorphic to
$\mathbb{R}$.

Two paths $\tau_1$ and $\tau_2$ are called homotopic (with endpoints fixed) if there exists
a continuous function $h : [0, 1] \times [0, 1] \to X$ such that the following four conditions
are met:

$$h(s, 0) = \tau_1(s) \quad \text{for all } s \in [0, 1], \tag{4.8}$$

$$h(s, 1) = \tau_2(s) \quad \text{for all } s \in [0, 1], \tag{4.9}$$

$$h(0, t) = h(0, 0) \quad \text{for all } t \in [0, 1], \tag{4.10}$$

and

$$h(1, t) = h(1, 0) \quad \text{for all } t \in [0, 1]. \tag{4.11}$$

The parameter t can be interpreted as a knob that is turned to gradually deform
the path from $\tau_1$ into $\tau_2$. The value t = 0 yields $\tau_1$ and t = 1 yields $\tau_2$.
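
For a space with no holes, such as $\mathbb{R}^2$, one simple way to build such an h is to
interpolate linearly between the two paths. The following Python sketch constructs
this straight-line homotopy; it is only an illustration for $\mathbb{R}^n$ and would not be
valid if the interpolated points could leave X. The two example paths and all names
are assumptions.

```python
def straight_line_homotopy(tau1, tau2):
    """Return h(s, t) that interpolates linearly from tau1 (t = 0) to tau2 (t = 1)."""
    def h(s, t):
        p, q = tau1(s), tau2(s)
        return tuple((1.0 - t) * a + t * b for a, b in zip(p, q))
    return h


# Two example paths in the plane from (0, 0) to (1, 0), chosen for illustration.
def tau1(s):
    return (s, 0.0)


def tau2(s):
    return (s, s * (1.0 - s))


def close(p, q, tol=1e-12):
    """Componentwise comparison of two points up to rounding error."""
    return all(abs(a - b) <= tol for a, b in zip(p, q))
```

With h = straight_line_homotopy(tau1, tau2), the four conditions can be verified at
sample values: h(s, 0) agrees with tau1(s), h(s, 1) agrees with tau2(s), and the
endpoints h(0, t) and h(1, t) stay fixed as the knob t is turned.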

During the warping process, the path image is not allowed to jump over
certain kinds of holes, such as the one shown in Figure 4.6.b. The key to preventing
homotopy from jumping over some holes is that h must be continuous. In higher
dimensions, however, there are many different kinds of holes. For the case of $\mathbb{R}^3$,
for example, suppose the space is like a block of Swiss cheese that contains air 
bubbles. Homotopy can easily go around the air bubbles, but it will not be able to 
pass through a hole that is drilled through the entire block of cheese. Air bubbles 
and other kinds of holes that appear in higher dimensions can be characterized 
by generalizing homotopy to the warping of surfaces, as opposed to paths. 

It is straightforward to show that homotopy defines an equivalence relation
on the set of all paths from some $x_1 \in X$ to some $x_2 \in X$. The resulting notion
of "equivalent paths" appears frequently in motion planning, control theory, and
many other contexts. Suppose that $X$ is path connected. If all paths fall into
the same equivalence class, then $X$ will be called simply-connected. Otherwise, $X$
will be called multiply-connected. The case of multiply-connected spaces is very
interesting.

The fundamental group The equivalence relation induced by homotopy starts
to enter the realm of algebraic topology, which is a branch of mathematics that
characterizes the structure of topological spaces in terms of algebraic objects,
such as groups. These resulting groups have important implications for motion
planning. Therefore, a brief overview is given here.

At the highest level of abstraction, the task is often considered as a mapping
between the category of all topological spaces and a category of some algebraic
objects, such as all groups. The fundamental group is the simplest of these
mappings to explain. It is often denoted as $\pi_1(X)$, which is the fundamental group
(first homotopy group) associated with a topological space, $X$. Let a (continuous)
path $f$ for which $f(0) = f(1)$ be called a loop. Let some $x_t \in X$ be designated as
a base point. For some arbitrary but fixed base point, $x_t$, consider the set of all
loops such that $f(0) = f(1) = x_t$. This can be made into a group by defining the
following binary operation. Let $\tau_1 : [0,1] \to X$ and $\tau_2 : [0,1] \to X$ be two loop
paths with the same base point. Their product $\tau = \tau_1 \circ \tau_2$ is defined as

$$\tau(t) = \begin{cases} \tau_1(2t) & \text{if } t \in [0, 1/2) \\ \tau_2(2t - 1) & \text{if } t \in [1/2, 1]. \end{cases} \tag{4.12}$$

This results in a continuous loop path because $\tau_1$ always terminates at $x_t$, and $\tau_2$
always begins at $x_t$. In a sense, the two paths are concatenated end-to-end.

Suppose now that the equivalence relation induced by homotopy is applied to
the set of all loop paths through a fixed point, $x_t$. It will no longer be important
which particular path was chosen from a class; any representative may be used.
The equivalence relation also applies when the set of loops is interpreted as a
group. The group operation actually occurs over the set of equivalence classes of paths.

Consider what happens when paths from two equivalence classes are combined
using $\circ$. Is the resulting path homotopic to either of the first two? Is the
resulting path homotopic to them if the original two are from the same homotopy
class? The answers in general are NO and NO. The groups that result provide an
interesting characterization of the connectivity of a topological space. Since these
groups are based on paths, there is a nice connection to motion planning.

Example 4.1.7 (A simply-connected space) Suppose that a topological space,
$X$, is simply connected. In this case, all loop paths from a base point, $x_t$, are
homotopic, resulting in one equivalence class. The result is $\pi_1(X) = 1_G$, which
just contains the identity element. ■

Figure 4.7: An illustration of why $\pi_1(\mathbb{RP}^2) = \mathbb{Z}_2$. (a) Two paths are shown
that are not equivalent. The integers 1 and 2 indicate precisely where the path
continues when it reaches the boundary. (b) A path that winds around twice. (c)
This is homotopic to a loop path that does not wind around at all, as shown in (a).
Eventually, the part of the path that appears at the bottom is pulled through the
top.

Example 4.1.8 (The circle) Suppose $X = S^1$. In this case, there is an
equivalence class for each $i \in \mathbb{Z}$, the set of integers. Here is one possible
definition. If $i > 0$, then it means that the path winds $i$ times around $S^1$ in the
counterclockwise direction and then returns to $x_t$. If $i < 0$, then the path winds
around $|i|$ times in the clockwise direction. If $i = 0$, then the path is equivalent to
one that remains at the base point. The fundamental group is $\mathbb{Z}$, with respect to the
operation of addition. If $\tau_1$ travels $i_1$ times counterclockwise, and $\tau_2$ travels $i_2$
times counterclockwise, then $\tau = \tau_1 \circ \tau_2$ belongs to the class of loops that travel
around $i_1 + i_2$ times counterclockwise. Think about additive inverses. If a path
travels 7 times around $S^1$, and it is combined with a path that travels 7 times in
the opposite direction, the result will be homotopic to a path that never leaves
the base point. Thus, $\pi_1(S^1) = \mathbb{Z}$. ■
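
The additive behavior in the circle example can be checked numerically. The sketch
below is only an illustration: it represents a loop on $S^1$ as a function from
$[0,1]$ to points on the unit circle, concatenates two loops end-to-end at double
speed, and estimates the winding number by summing small angle increments. The
function names and the sampling resolution are assumptions made for this example.

```python
import math


def circle_loop(i):
    """Loop based at angle 0 that winds i times counterclockwise (clockwise if i < 0)."""
    return lambda s: (math.cos(2 * math.pi * i * s), math.sin(2 * math.pi * i * s))


def concatenate(tau1, tau2):
    """Traverse tau1 on [0, 1/2) and tau2 on [1/2, 1], each at double speed."""
    def tau(s):
        return tau1(2 * s) if s < 0.5 else tau2(2 * s - 1)
    return tau


def winding_number(tau, samples=2000):
    """Estimate how many times the loop tau winds counterclockwise around the origin."""
    total = 0.0
    x, y = tau(0.0)
    previous = math.atan2(y, x)
    for k in range(1, samples + 1):
        x, y = tau(k / samples)
        angle = math.atan2(y, x)
        # Wrap each increment into [-pi, pi) so that crossing the atan2 branch cut is harmless.
        step = (angle - previous + math.pi) % (2 * math.pi) - math.pi
        total += step
        previous = angle
    return round(total / (2 * math.pi))


print(winding_number(concatenate(circle_loop(2), circle_loop(-5))))
print(winding_number(concatenate(circle_loop(7), circle_loop(-7))))
```

The second call corresponds to the additive-inverse case in the example: seven turns
one way followed by seven turns the other way yield the class of the constant loop.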



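Example 4.1.9 (The torus) For the torus, $\pi_1(T^n) = \mathbb{Z}^n$, in which the $i$th
component of $\mathbb{Z}^n$ corresponds to the number of times a loop path wraps around the
$i$th component of $T^n$. This makes intuitive sense since $T^n$ is just the Cartesian
product of $n$ circles. A related situation arises if we start with a
simply connected subset of the plane and drill out $n$ disjoint, bounded holes: the
net number of times a loop winds around each hole again gives an element of $\mathbb{Z}^n$.
(For $n \geq 2$ the fundamental group of this space is actually the noncommutative
free group on $n$ generators, because the order in which the holes are encircled
matters; $\mathbb{Z}^n$ is its commutative version.) This
situation arises frequently in motion planning when a mobile robot must avoid
colliding with $n$ disjoint obstacles. ■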



By now it seems that the fundamental group simply keeps track of how many
times a path loops around holes. This next example yields some very bizarre
behavior that helps illustrate some of the interesting structure that arises in
algebraic topology.

Example 4.1.10 ($\mathbb{RP}^2$) Suppose $X = \mathbb{RP}^2$, the projective plane. In this case,
there are only two equivalence classes. All paths that "wrap around" an even
number of times are homotopic. Likewise, all paths that wrap around an odd
number of times are homotopic. This strange behavior is illustrated in Figure 4.7.
The resulting fundamental group therefore has only two elements, $\pi_1(\mathbb{RP}^2) = \mathbb{Z}_2$,
the cyclic group of order 2, which corresponds to addition mod 2. This makes
intuitive sense because the group keeps track of whether a sum of integers is odd
or even, which in this application corresponds to the total number of windings
around $\mathbb{RP}^2$. The fundamental group is the same for $\mathbb{RP}^3$, which will be seen in
Section 4.2.2 to be homeomorphic to the set of 3D rotations.

Thus, there are surprisingly only two path classes for the set of 3D rotations. ■

Unfortunately, even if two topological spaces are not homeomorphic, their
fundamental groups may be identical. For example, $\mathbb{Z}$ is the fundamental group
of $S^1$, the cylinder, $\mathbb{R} \times S^1$, and the Möbius band. In the last case, the fundamental
group does not care that there is a "twist" in the space. Another problem is that
spaces with interesting connectivity may be declared as simply connected. The
fundamental group of the sphere, $S^2$, is trivial, the same as for $\mathbb{R}^2$. Try envisioning
loop paths on the sphere; it can be seen that they all fall into one equivalence class.
The fundamental group will also neglect bubbles in $\mathbb{R}^3$ because the homotopy
can warp paths around them. (Note that this space is even considered simply
connected by our definition.) This last problem can be fixed by defining
second-order homotopy groups. For example, a continuous function, $[0,1] \times [0,1] \to X$,
of two variables can be used instead of a path. The resulting homotopy generates
a kind of sheet or surface that can be warped through the space, to yield a
homotopy group $\pi_2(X)$ that will wrap around bubbles in $\mathbb{R}^3$, producing a different
group. This idea can be extended beyond two dimensions to detect many
different kinds of holes in higher dimensional spaces. This leads to the higher-order
homotopy groups. A stronger concept than simply connected for a space is that
its homotopy groups of all orders are equal to the identity group. This prevents
all kinds of holes from occurring, and implies that a space, $X$, is contractible,
which means a homotopy can be constructed that shrinks $X$ to a point [317]. In
many motion planning contexts, this notion may be a preferable substitute for
simply-connected.

An alternative to basing groups on homotopy is to derive them using homology, 
which is based on the structure of cell complexes instead of homotopy mappings. 
This subject is much more complicated to present, but is much more powerful for 
proving topology theorems. See the literature overview at the end of the chapter 
for suggested further reading on algebraic topology. 

4.2 Defining the Configuration Space 

This section defines the manifolds that arise from the transformations of Chapter
3. For each robot a set of transformations can be made. If the robot has $n$
degrees of freedom, this leads to a manifold of dimension $n$ called the configuration
space or C-space. It will be generally denoted by $\mathcal{C}$. In the context of this book,
the configuration space may be considered as a special form of state space. To
solve a motion planning problem, algorithms must conduct a search in this space.
The configuration space notion provides a powerful abstraction that converts the
complicated models and transformations of Chapter 3 into the general problem
of computing a path in a manifold. Algorithms developed directly for this
purpose apply to a wide variety of different kinds of robots and
transformations. In Section 4.3 the problem will be complicated by bringing obstacles into
the configuration space, but in this section there will be no obstacles.

4.2.1 2D Rigid Bodies: SE(2)

Section 3.2.2 expressed how to transform a rigid body in $\mathbb{R}^2$ by a homogeneous
matrix, $T$, given by (3.30). The task in this chapter is to characterize the set of
all possible rigid body transformations. Which manifold will this be? Here is the
answer and a brief explanation. Since any $x_t, y_t \in \mathbb{R}$ can be selected for translation,
this alone yields a manifold $M_1 = \mathbb{R}^2$. Independently, any rotation, $\theta \in [0, 2\pi)$, can
be applied. Since $2\pi$ yields the same rotation as 0, they can be identified, which
makes the set of 2D rotations into a manifold, $M_2 = S^1$. To obtain the manifold
that corresponds to all rigid body motions, simply take $\mathcal{C} = M_1 \times M_2 = \mathbb{R}^2 \times S^1$.
The answer to the question is that the C-space is a kind of cylinder.

Now a more detailed technical argument will be given. The main reason
is that such a simple, intuitive argument will not work for the 3D case. Our
approach is to introduce some of the technical machinery here for the 2D case,
which is easier to understand, and then extend it to the 3D case in Section 4.2.2.

Groups The first step is to consider the set of transformations as a group, in
addition to a topological space.⁸ A group is a set, $G$, together with a binary
operation, $\circ$, such that the group axioms are satisfied:

1. (Closure) For any $a, b \in G$, the product $a \circ b \in G$.

2. (Associativity) For all $a, b, c \in G$, $(a \circ b) \circ c = a \circ (b \circ c)$. Hence, parentheses
are not needed, and the product may be written as $a \circ b \circ c$.

3. (Identity) There is an element $e \in G$, called the identity, such that for all
$a \in G$, $e \circ a = a$ and $a \circ e = a$.

4. (Inverse) For every element $a \in G$, there is an element $a^{-1}$, called the
inverse of $a$, for which $a \circ a^{-1} = e$ and $a^{-1} \circ a = e$.

⁸ The groups considered in this section are actually Lie groups because they are differentiable
manifolds. We will not use the name here, however, because the notion of a differentiable
structure was not defined. Readers familiar with Lie groups, however, will recognize most
of the coming concepts.

Here are some simple examples. The set of integers, $\mathbb{Z}$, is a group with respect
to addition. The identity is 0, and the inverse of each $i$ is $-i$. The set, $\mathbb{Q} \setminus 0$, of
rational numbers with 0 removed, is a group with respect to multiplication. The
identity is 1, and the inverse of every element, $q$, is $1/q$ (0 was removed to avoid
division by zero).
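
For a finite set, the four axioms can be checked by brute force. The following sketch
is an illustration only; the helper `is_group` and the choice of arithmetic modulo 5 are
assumptions made for this example. It mirrors the point about removing 0: the residues
modulo 5 form a group under addition, they fail under multiplication, and they succeed
again once 0 is removed.

```python
from itertools import product


def is_group(elements, op):
    """Check closure, associativity, identity, and inverses by exhaustive search."""
    elements = list(elements)
    members = set(elements)
    # Closure: every product must stay in the set.
    if any(op(a, b) not in members for a, b in product(elements, repeat=2)):
        return False
    # Associativity: (a o b) o c == a o (b o c) for all triples.
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    # Identity: some e with e o a == a and a o e == a for all a.
    identities = [e for e in elements
                  if all(op(e, a) == a and op(a, e) == a for a in elements)]
    if not identities:
        return False
    e = identities[0]
    # Inverse: every a needs some b with a o b == e and b o a == e.
    return all(any(op(a, b) == e and op(b, a) == e for b in elements)
               for a in elements)


print(is_group(range(5), lambda a, b: (a + b) % 5))     # addition mod 5
print(is_group(range(5), lambda a, b: (a * b) % 5))     # multiplication, 0 included
print(is_group(range(1, 5), lambda a, b: (a * b) % 5))  # multiplication, 0 removed
```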

Matrix groups Groups will now be derived from sets of matrices, ultimately
leading to $SO(n)$, the group of $n \times n$ rotation matrices, which is very important for
motion planning. The set of all nonsingular $n \times n$ real-valued matrices is called the
general linear group, denoted by $GL(n)$, with respect to matrix multiplication.
Each matrix $A \in GL(n)$ has an inverse $A^{-1} \in GL(n)$, which when multiplied
yields the identity matrix, $AA^{-1} = I$. The matrices must be nonsingular for the
same reason that 0 was removed from $\mathbb{Q}$. The analog of division by zero for matrix
algebra is the inability to invert a singular matrix.

Many interesting groups can be formed from one group, $G_1$, by removing some
elements to obtain a subgroup, $G_2$. To be a subgroup, $G_2$ must be a subset of $G_1$,
and must satisfy the group axioms. By constructing subgroups, we will arrive at
the set of rotation matrices. One important subgroup of $GL(n)$ is the orthogonal
group, $O(n)$, which is the set of all matrices, $A \in GL(n)$, for which $AA^T = I$,
in which $A^T$ denotes the matrix transpose of $A$. Note that such matrices will have
orthogonal columns (the inner product of any pair is zero) and the determinant
will be 1 or $-1$. This can be seen by observing that $AA^T$ takes the inner product
of every pair of rows of $A$ (and $AA^T = I$ implies $A^T A = I$, which does the same
for the columns). If the two vectors are different, the result must be 0; if they
are the same, the result is 1 because $AA^T = I$. The special orthogonal group,
$SO(n)$, is the subgroup of $O(n)$ in which every matrix has determinant 1. Another
name for $SO(n)$ is the group of $n$-dimensional rotation matrices.

A chain of groups, $SO(n) \leq O(n) \leq GL(n)$, has been described in which $\leq$
denotes "a subgroup of". These can also be considered as topological spaces. The
set of all $n \times n$ matrices (which is not a group with respect to multiplication)
with real-valued entries is homeomorphic to $\mathbb{R}^{n^2}$ because there are $n^2$ entries in
the matrix that can be independently chosen. For $GL(n)$, singular matrices are
removed, but an $n^2$-dimensional manifold is still obtained. For $O(n)$, the expression
$AA^T = I$ corresponds to $n^2$ algebraic equations that have to be satisfied. This
should substantially drop the dimension. Note, however, that many of the
equations are redundant (pick your favorite value for $n$, multiply the matrices, and
see what happens). There are only $\binom{n}{2}$ ways (pairwise combinations) to take
the inner product of pairs of columns, and there are $n$ equations that require the
magnitude of each column to be 1. This yields a total of $n(n+1)/2$ independent
equations. Each independent equation drops the manifold dimension by one, and
the resulting dimension of $O(n)$ is $n^2 - n(n+1)/2 = n(n-1)/2$, which is easily
remembered as $\binom{n}{2}$. To obtain $SO(n)$, the constraint $\det A = 1$ is added, which
eliminates exactly half of the elements of $O(n)$, but keeps the dimension the same.

Example 4.2.1 It is helpful to illustrate the concepts for $n = 2$. The set of all
$2 \times 2$ matrices may be denoted by

$$\left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \;\text{ for which }\; a, b, c, d \in \mathbb{R} \right\}, \tag{4.13}$$

and is homeomorphic to $\mathbb{R}^4$. The group $GL(2)$ is formed from the set of all
nonsingular $2 \times 2$ matrices, which introduces the constraint that $ad - bc \neq 0$. The
set of singular matrices forms a 3D manifold with boundary in $\mathbb{R}^4$, but all other
elements of $\mathbb{R}^4$ are in $GL(2)$; therefore, $GL(2)$ is a four-dimensional manifold.
Next, the constraint $AA^T = I$ is enforced to obtain $O(2)$. This becomes

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} a & c \\ b & d \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \tag{4.14}$$

which directly yields four algebraic equations

$$a^2 + b^2 = 1 \tag{4.15}$$

$$ac + bd = 0 \tag{4.16}$$

$$ca + db = 0 \tag{4.17}$$

$$c^2 + d^2 = 1. \tag{4.18}$$

There are two kinds of equations. There is $\binom{2}{2} = 1$ equation, (4.16), that forces
the inner product of the two rows, $(a, b)$ and $(c, d)$, to be 0. There are $n = 2$ other
constraints, (4.15) and (4.18), which force the rows to be unit vectors. The resulting
dimension of the manifold is $\binom{2}{2} = 1$ because we started with $\mathbb{R}^4$ and lost three
dimensions from (4.15), (4.16), and (4.18). What does this manifold look like? Imagine that
there are two different two-dimensional unit vectors, $(a, b)$ and $(c, d)$. Any value
can be chosen for $(a, b)$ as long as $a^2 + b^2 = 1$. This looks like $S^1$, but the inner
product of $(a, b)$ and $(c, d)$ must also be 0. Therefore, for each value of $(a, b)$, there
are two choices for $c$ and $d$: 1) $c = b$ and $d = -a$, or 2) $c = -b$ and $d = a$. It
appears that there are two circles! The manifold is $S^1 \sqcup S^1$, in which $\sqcup$ denotes
the union of disjoint sets. Note that this manifold is not connected because no
path exists from one circle to the other.
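
The two circles can be explored numerically. In the sketch below, which is only an
illustration with made-up helper names, the point $(a, b)$ on the unit circle is
written as $(\cos\theta, \sin\theta)$, and the argument `choice` selects option 1
($c = b$, $d = -a$) or option 2 ($c = -b$, $d = a$). Both options satisfy
$AA^T = I$, but their determinants differ in sign, so no continuous path of orthogonal
matrices can connect them.

```python
import math


def o2_matrix(theta, choice):
    """Return [[a, b], [c, d]] in O(2) with (a, b) = (cos theta, sin theta)."""
    a, b = math.cos(theta), math.sin(theta)
    if choice == 1:
        return [[a, b], [b, -a]]   # option 1: c = b, d = -a
    return [[a, b], [-b, a]]       # option 2: c = -b, d = a


def times_transpose(A):
    """Compute A A^T for a 2x2 matrix stored as nested lists."""
    return [[A[i][0] * A[j][0] + A[i][1] * A[j][1] for j in range(2)]
            for i in range(2)]


def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


for choice in (1, 2):
    A = o2_matrix(0.7, choice)
    print(choice, times_transpose(A), det(A))
```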

The final step is to require that $\det A = ad - bc = 1$, to obtain $SO(2)$, the set
of all 2D rotation matrices. Without this condition, there would be matrices that
produce a rotated mirror image of the rigid body. The constraint simply forces
the choice for $c$ and $d$ to be $c = -b$ and $d = a$. This throws away one of the circles
from $O(2)$, to obtain a single circle for $SO(2)$. We have finally obtained what you
already knew: $SO(2)$ is homeomorphic to $S^1$. The circle can be parameterized
using polar coordinates to obtain the standard 2D rotation matrix, (3.25), given
in Section 3.2.2. ■

Special Euclidean group Now that the group of rotations, $SO(n)$, is
characterized, the next step is to allow both rotations and translations. This corresponds
to the set of all $(n+1) \times (n+1)$ transformation matrices of the form

$$\left\{ \begin{pmatrix} R & v \\ 0 & 1 \end{pmatrix} \;\text{ for which }\; R \in SO(n) \text{ and } v \in \mathbb{R}^n \right\}. \tag{4.19}$$

This should look like a generalization of (3.44) and (3.48), which were for $n = 2$
and $n = 3$, respectively. The $R$ part of the matrix achieves rotation of an
$n$-dimensional body in $\mathbb{R}^n$, and the $v$ part achieves translation of the same body.
The result is a group, $SE(n)$, which is called the special Euclidean group. As a
topological space, $SE(n)$ is homeomorphic to $\mathbb{R}^n \times SO(n)$, because the rotation
matrix and translation vectors may be chosen independently. In the case of $n = 2$,
this means $SE(2)$ is homeomorphic to $\mathbb{R}^2 \times S^1$, which verifies what was stated at
the beginning of this section. Thus, the C-space is

$$\mathcal{C} \cong \mathbb{R}^2 \times S^1 \tag{4.20}$$

for the case of an unconstrained rigid body.
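
As a small illustration of the group structure for $n = 2$, the sketch below builds
$3 \times 3$ homogeneous matrices from a configuration $(x, y, \theta)$, composes two
of them by matrix multiplication, and reads the resulting configuration back out. The
helper names `se2`, `compose`, and `to_config` are assumptions made for this example,
and the angle is reported in $[0, 2\pi)$ to reflect the identification of $2\pi$ with 0.

```python
import math


def se2(x, y, theta):
    """Homogeneous matrix for rotation by theta followed by translation by (x, y)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, x],
            [s, c, y],
            [0.0, 0.0, 1.0]]


def compose(A, B):
    """Matrix product A B, which applies B first and then A."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def to_config(T):
    """Recover (x, y, theta) with theta wrapped into [0, 2*pi)."""
    theta = math.atan2(T[1][0], T[0][0]) % (2 * math.pi)
    return T[0][2], T[1][2], theta


T = compose(se2(1.0, 0.0, math.pi / 2), se2(1.0, 0.0, math.pi / 2))
print(to_config(T))
```

The product is again of the form (4.19), which is the closure property; the rotation
angles add, while the second translation is rotated before being added to the first.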

Interpreting the C-space It is important to consider the topological
implications of $\mathcal{C}$. Since $S^1$ is multiply connected, $\mathbb{R} \times S^1$ and $\mathbb{R}^2 \times S^1$ are multiply
connected. It is difficult to visualize $\mathcal{C}$ because it is a three-dimensional manifold;
however, there is a nice interpretation using identification. Start with the open
unit cube, $(0,1)^3 \subset \mathbb{R}^3$. Add in the boundary points of the form $(x, y, 0)$, and
make the identification $(x, y, 0) \sim (x, y, 1)$ for all $x, y \in (0,1)$. This means that
when traveling in the X and Y directions, there is an "edge" to the configuration
space; however, traveling in the Z direction will cause a wraparound.

It is very important for a motion planning algorithm to understand that this
wraparound exists. For example, consider $\mathbb{R} \times S^1$ because it is easier to visualize.
Imagine a path planning problem for which $\mathcal{C} = \mathbb{R} \times S^1$, as depicted in Figure 4.8.
Suppose the top and bottom are identified to make a cylinder, and there is an
obstacle across the middle. Suppose the task is to find a path from $q_i$ to $q_g$. If
the top and bottom were not identified, then it would not be possible to connect
$q_i$ to $q_g$; however, if the algorithm realizes it was given a cylinder, the task is
straightforward. In general, it is very important to understand the topology of $\mathcal{C}$;
otherwise, potential solutions will be lost.
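
One place where software must respect the wraparound is in measuring how far apart
two configurations are. The sketch below is an assumption-laden illustration: the text
has not defined a metric on $\mathbb{R} \times S^1$, so the Euclidean combination used
in `cylinder_distance` is simply one convenient choice, and the function names are
invented for this example.

```python
import math


def angular_difference(theta1, theta2):
    """Signed shortest rotation from theta1 to theta2, in [-pi, pi)."""
    return (theta2 - theta1 + math.pi) % (2 * math.pi) - math.pi


def cylinder_distance(q1, q2):
    """Distance between q = (x, theta) pairs on R x S^1, crossing the seam if shorter."""
    dx = q2[0] - q1[0]
    dtheta = angular_difference(q1[1], q2[1])
    return math.hypot(dx, dtheta)


q_i = (0.0, 0.1)
q_g = (0.0, 2 * math.pi - 0.1)
print(abs(q_g[1] - q_i[1]))         # naive angular gap, ignoring the identification
print(cylinder_distance(q_i, q_g))  # gap measured across the identified boundary
```

A planner that used only the naive difference would treat these two nearby
configurations as almost a full turn apart.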

The next section addresses $SE(n)$ for $n = 3$. The main obstacle is determining
the topology of $SO(3)$. At least we do not have to go beyond $n = 3$ in this book.

4.2.2 3D Rigid Bodies: SE(3)

One might expect that defining $\mathcal{C}$ for a 3D rigid body is an obvious extension of the
2D case; however, 3D rotations are significantly more complicated. The resulting
C-space will be a six-dimensional manifold, $\mathcal{C} = \mathbb{R}^3 \times \mathbb{RP}^3$. Three dimensions come
from translation and three more from rotation.

Figure 4.8: A planning algorithm may have to cross the identification boundary
to find a solution path.

The main quest in this section is to determine the topology of $SO(3)$. In
Section 3.2.3, yaw, pitch, and roll were used to generate rotation matrices. These
angles were very convenient for visualization, performing transformations in
software, and also for deriving the DH parameters. However, these were concerned
with a single rotation, whereas the current problem is to characterize the set of
all rotations. It is possible to use $\alpha$, $\beta$, and $\gamma$ to parameterize the set of rotations,
but it causes serious troubles. There are some cases in which nonzero angles yield
the identity rotation matrix, which is equivalent to $\alpha = \beta = \gamma = 0$. There are
also cases in which a continuum of values for yaw, pitch, and roll angles yield the
same rotation matrix. These problems destroy the topology, which causes both
theoretical and practical difficulties in motion planning.

Consider applying the matrix group concepts from Section 4.2.1. The general
linear group $GL(3)$ is a nine-dimensional manifold, an open subset of $\mathbb{R}^9$. The
orthogonal group, $O(3)$,
is determined by imposing the constraint $AA^T = I$. There are $\binom{3}{2} = 3$
independent equations that require distinct columns to be orthogonal, and 3 independent
equations that force the magnitude of each column to be 1. This means that
$O(3)$ will have three dimensions, which matches our intuition since there were
three rotation parameters in Section 3.2.3. To obtain $SO(3)$, the last constraint,
$\det A = 1$, is added. Recall from Example 4.2.1 that $O(2)$ consists of two circles,
and the constraint $\det A = 1$ selects one of them. In the case of $O(3)$, there will
be two three-spheres, $S^3 \sqcup S^3$, and $\det A = 1$ selects one of them. However, there
is one additional complication: antipodal points on these spheres generate the
same rotation matrix. This will be seen shortly when quaternions are used to
parameterize $SO(3)$.



Using complex numbers to represent SO(2) Before introducing
quaternions to represent 3D rotations, consider using complex numbers to represent 2D
rotations. Let the term unit complex number refer to any complex number, $a + bi$,
for which $a^2 + b^2 = 1$.

Note that the set of all unit complex numbers forms a group under
multiplication. It will be seen that it is "the same" group as $SO(2)$. This idea needs to
be made more precise. Two groups, $G$ and $H$, are considered "the same" if they
are isomorphic, which means that there exists a bijective function $f : G \to H$
such that for all $a, b \in G$, $f(a) \circ f(b) = f(a \circ b)$. This means that we can perform
some calculations with $G$ for a while, map the result to $H$, perform more
calculations, and map back to $G$ without any trouble. The groups $G$ and $H$ are just
two different representations of the same thing.

This is true of the unit complex numbers and $SO(2)$. To see this clearly, recall
that complex numbers can be represented in polar form as $re^{i\theta}$; a unit complex
number is simply $e^{i\theta}$. A bijective mapping can be made between 2D rotation
matrices and unit complex numbers by letting $e^{i\theta}$ correspond to the rotation
matrix (3.25).

If complex numbers are used to represent rotations, it is important that they
behave algebraically in the same way. If two rotations are combined, the matrices
are multiplied. The equivalent operation will be multiplication of complex
numbers. Suppose that a 2D robot is rotated by $\theta_1$, followed by $\theta_2$. In polar form, the
complex numbers are multiplied to yield $e^{i\theta_1} e^{i\theta_2} = e^{i(\theta_1 + \theta_2)}$, which clearly
represents a rotation of $\theta_1 + \theta_2$. If the unit complex number is represented in Cartesian
form, then the rotations corresponding to $a_1 + b_1 i$ and $a_2 + b_2 i$ are combined to
obtain $(a_1 a_2 - b_1 b_2) + (a_1 b_2 + a_2 b_1) i$. Note that we did not use complex numbers
to express the solution to a polynomial equation; we simply borrowed their nice
algebraic properties. At any time, a complex number $a + bi$ can be converted into
the equivalent rotation matrix

$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix}. \tag{4.21}$$

Recall that only one independent parameter needs to be specified because
$a^2 + b^2 = 1$. Hence, it appears that the set of unit complex numbers is the same
manifold as $SO(2)$, which is the circle, $S^1$ (recall that "same" means in the sense
of homeomorphism).
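
The claim that multiplication of unit complex numbers behaves exactly like
multiplication of 2D rotation matrices can be spot-checked with Python's built-in
complex type. This is only a sketch; `complex_to_matrix` and `matmul2` are names
invented here, and the conversion places $a$ on the diagonal and $\mp b$ off the
diagonal, matching the standard rotation matrix with $a = \cos\theta$ and
$b = \sin\theta$.

```python
import cmath


def complex_to_matrix(z):
    """Rotation matrix for the unit complex number z = a + bi."""
    a, b = z.real, z.imag
    return [[a, -b],
            [b, a]]


def matmul2(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


z1 = cmath.exp(1j * 0.4)   # rotation by 0.4 radians
z2 = cmath.exp(1j * 1.1)   # rotation by 1.1 radians

via_complex = complex_to_matrix(z1 * z2)
via_matrices = matmul2(complex_to_matrix(z1), complex_to_matrix(z2))
print(via_complex)
print(via_matrices)
```

Up to floating-point rounding, both routes give the matrix for a rotation by
$0.4 + 1.1 = 1.5$ radians.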

Quaternions The manner in which complex numbers were used to represent 2D
rotations will now be adapted to using quaternions to represent 3D rotations. Let
$\mathbb{H}$ represent the set of quaternions, in which each quaternion, $h \in \mathbb{H}$, is represented
as $h = a + bi + cj + dk$, and $a, b, c, d \in \mathbb{R}$. A quaternion can be considered as a
four-dimensional vector. The symbols $i$, $j$, and $k$ are used to denote three "imaginary"
components of the quaternion. The following relationships are defined:
$i^2 = j^2 = k^2 = -1$, $ij = k$, $jk = i$, and $ki = j$. Using these, multiplication of two
quaternions, $h_1 = a_1 + b_1 i + c_1 j + d_1 k$ and $h_2 = a_2 + b_2 i + c_2 j + d_2 k$, can be derived
to obtain $h_1 \cdot h_2 = a_3 + b_3 i + c_3 j + d_3 k$, in which

$$a_3 = a_1 a_2 - b_1 b_2 - c_1 c_2 - d_1 d_2 \tag{4.22}$$

$$b_3 = a_1 b_2 + a_2 b_1 + c_1 d_2 - c_2 d_1 \tag{4.23}$$

$$c_3 = a_1 c_2 + a_2 c_1 + b_2 d_1 - b_1 d_2 \tag{4.24}$$

$$d_3 = a_1 d_2 + a_2 d_1 + b_1 c_2 - b_2 c_1. \tag{4.25}$$

Figure 4.9: Any 3D rotation can be considered as a rotation by an angle $\theta$ about
the axis given by the unit direction vector $v = [v_1 \; v_2 \; v_3]$.

Using this operation, it can be shown that the nonzero elements of $\mathbb{H}$ form a group
with respect to quaternion multiplication. Note, however, that the multiplication is
not commutative! This was also true of 3D rotations; there must be a good reason.

For convenience, quaternion multiplication can be expressed in terms of vector
multiplications, a dot product, and a cross product. Let $v = [b \; c \; d]$ be a
three-dimensional vector that represents the final three quaternion components. The
first component of $h_1 \cdot h_2$ is $a_1 a_2 - v_1 \cdot v_2$. The final three components are given
by the three-dimensional vector $a_1 v_2 + a_2 v_1 + v_1 \times v_2$.

Just as unit complex numbers were needed for $SO(2)$, unit quaternions are
needed for $SO(3)$, which means that $\mathbb{H}$ is restricted to quaternions for which
$a^2 + b^2 + c^2 + d^2 = 1$. Note that this forms a subgroup because the multiplication
of unit quaternions yields a unit quaternion, and the other group axioms hold.
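
The component formulas for the product translate directly into code. The sketch below
stores a quaternion as a tuple $(a, b, c, d)$; the function names are illustrative
assumptions. It checks the defining relationship $ij = k$, shows that reversing the
order flips the sign, and confirms on one example that the product of two unit
quaternions has unit length.

```python
import math


def qmul(h1, h2):
    """Quaternion product of h1 = (a1, b1, c1, d1) and h2 = (a2, b2, c2, d2)."""
    a1, b1, c1, d1 = h1
    a2, b2, c2, d2 = h2
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + a2 * b1 + c1 * d2 - c2 * d1,
            a1 * c2 + a2 * c1 + b2 * d1 - b1 * d2,
            a1 * d2 + a2 * d1 + b1 * c2 - b2 * c1)


def qnorm(h):
    """Euclidean length of h viewed as a vector in R^4."""
    return math.sqrt(sum(x * x for x in h))


i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
print(qmul(i, j))   # the quaternion k
print(qmul(j, i))   # the quaternion -k, so the product is not commutative

h1 = (0.5, 0.5, 0.5, 0.5)
h2 = (math.cos(0.3), math.sin(0.3), 0.0, 0.0)
print(qnorm(qmul(h1, h2)))
```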

The next step is to describe a mapping from unit quaternions to $SO(3)$. Let
the unit quaternion $h = a + bi + cj + dk$ map to the matrix

$$R(h) = \begin{pmatrix} 2(a^2 + b^2) - 1 & 2(bc - ad) & 2(bd + ac) \\ 2(bc + ad) & 2(a^2 + c^2) - 1 & 2(cd - ab) \\ 2(bd - ac) & 2(cd + ab) & 2(a^2 + d^2) - 1 \end{pmatrix}, \tag{4.26}$$

which can be verified as orthogonal and $\det R(h) = 1$. Therefore, it belongs to
$SO(3)$. It is not shown here, but it conveniently turns out that $h$ represents the
rotation shown in Figure 4.9, by making the assignment

$$h = \cos\frac{\theta}{2} + \left(v_1 \sin\frac{\theta}{2}\right) i + \left(v_2 \sin\frac{\theta}{2}\right) j + \left(v_3 \sin\frac{\theta}{2}\right) k. \tag{4.27}$$

Unfortunately, this representation is not unique. It can be verified in (4.26)
that $R(h) = R(-h)$. A nice geometric interpretation is given in Figure ??. The
quaternions $h$ and $-h$ represent the same rotation because a rotation of $\theta$ about
the direction $v$ is equivalent to a rotation of $2\pi - \theta$ about the direction $-v$.
Consider the quaternion representation of the second expression of rotation with
respect to the first. The real part will be

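$$\cos\left(\frac{2\pi - \theta}{2}\right) = \cos\left(\pi - \frac{\theta}{2}\right) = -\cos\left(\frac{\theta}{2}\right) = -a. \tag{4.28}$$

The $i$, $j$, and $k$ components will be

$$-v \sin\left(\frac{2\pi - \theta}{2}\right) = -v \sin\left(\pi - \frac{\theta}{2}\right) = -v \sin\left(\frac{\theta}{2}\right) = [-b \;\; -c \;\; -d]. \tag{4.29}$$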

The quaternion —h has been constructed. Thus, h and —h represent the same 
rotation. Luckily, this is the only problem, and the mapping given by (4.26) is 
two-to-one. 
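
The two-to-one behavior is easy to observe numerically. The following sketch, with
function names chosen only for illustration, builds a unit quaternion from an axis
and angle, converts it to a rotation matrix using the entries of (4.26), and compares
the matrices obtained from $h$ and $-h$. The axis is assumed to be a unit vector.

```python
import math


def quat_from_axis_angle(v, theta):
    """Unit quaternion (a, b, c, d) for rotation by theta about the unit vector v."""
    v1, v2, v3 = v
    s = math.sin(theta / 2)
    return (math.cos(theta / 2), v1 * s, v2 * s, v3 * s)


def quat_to_matrix(h):
    """Rotation matrix R(h) for a unit quaternion h = (a, b, c, d)."""
    a, b, c, d = h
    return [[2 * (a * a + b * b) - 1, 2 * (b * c - a * d), 2 * (b * d + a * c)],
            [2 * (b * c + a * d), 2 * (a * a + c * c) - 1, 2 * (c * d - a * b)],
            [2 * (b * d - a * c), 2 * (c * d + a * b), 2 * (a * a + d * d) - 1]]


h = quat_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
minus_h = tuple(-x for x in h)

print(quat_to_matrix(h))
print(quat_to_matrix(h) == quat_to_matrix(minus_h))

# Rotating by 2*pi - theta about -v produces (approximately) the quaternion -h.
print(quat_from_axis_angle((0.0, 0.0, -1.0), 2 * math.pi - math.pi / 2))
```

Every entry of $R(h)$ is a sum of products of two quaternion components, so negating
all four components leaves each entry unchanged.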

This can be fixed by the identification trick. Note that the set of unit
quaternions is homeomorphic to $S^3$ because of the constraint $a^2 + b^2 + c^2 + d^2 = 1$. The
algebraic properties of quaternions are not relevant at this point. Just imagine
each $h$ as an element of $\mathbb{R}^4$, and the constraint $a^2 + b^2 + c^2 + d^2 = 1$ forces the
points to lie on $S^3$. Using identification, declare $h \sim -h$ for all unit quaternions.
This means that the antipodal points of $S^3$ are identified. Recall from the end of
Section 4.1.2 that when antipodal points are identified, then $S^n / \sim \; \cong \mathbb{RP}^n$. Hence,
$SO(3) \cong \mathbb{RP}^3$, which can be considered as the set of all lines through the origin
of $\mathbb{R}^4$, but this is hard to visualize. An extension of the representation of $\mathbb{RP}^2$ in
Figure 4.5 can be made to $\mathbb{RP}^3$. Start with $(0,1)^3 \subset \mathbb{R}^3$, and make three different
kinds of identifications, one for each pair of opposite cube faces, and add all of the
points to the manifold. For each kind of identification a twist needs to be made
(without the twist, $T^3$ would be obtained). For example, in the Z direction, let
$(x, y, 0) \sim (1 - x, 1 - y, 1)$ for all $x, y \in [0,1]$.

A useful way to force uniqueness of rotations is to require staying in the "upper
half" of $S^3$. For example, we can require that $a \geq 0$, as long as the boundary case
of $a = 0$ is handled properly because of antipodal points at the equator of $S^3$. If
$a = 0$, then we can require that $b \geq 0$. However, if $a = b = 0$, then we must
require that $c \geq 0$ because points such as $(0, 0, -1, 0)$ and $(0, 0, 1, 0)$ are the same
rotation. Finally, if $a = b = c = 0$, then only $d = 1$ is allowed. If such restrictions
are made, it is important, however, to remember the connectivity of $\mathbb{RP}^3$. If a
path travels across the equator of $S^3$, it must be mapped to the appropriate place
in the "northern hemisphere". At the instant it hits the equator, it must move
to the antipodal point. These concepts are much easier to visualize if you remove
a dimension and imagine them for $S^2 \subset \mathbb{R}^3$, as described at the end of
Section 4.1.2.
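
The cascade of cases above amounts to one rule: the first nonzero component of the
quaternion must be positive. A minimal sketch of that rule follows. The function name
`canonical` and the tolerance `eps`, used to decide when a floating-point component
counts as zero, are assumptions introduced for this example.

```python
def canonical(h, eps=1e-12):
    """Return whichever of h and -h has a positive first nonzero component."""
    for component in h:
        if component > eps:
            return tuple(h)                 # already in the chosen half of S^3
        if component < -eps:
            return tuple(-x for x in h)     # move to the antipodal point
    return tuple(h)                         # all components ~ 0: not a unit quaternion


print(canonical((0.5, -0.5, 0.5, 0.5)))    # a > 0, so nothing changes
print(canonical((-0.5, 0.5, 0.5, 0.5)))    # a < 0, so the sign is flipped
print(canonical((0, 0, -1, 0)))            # a = b = 0 and c < 0, so the sign is flipped
```

As noted in the text, a path of rotations may jump between antipodal points when this
rule is applied pointwise, so a planner still has to account for the connectivity of
the space.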

Using quaternion multiplication The representation of rotations boiled down
to picking points on $S^3$ and respecting the fact that antipodal points give the same
element of $SO(3)$. In a sense, this has nothing to do with the algebraic properties
of quaternions. It merely means that $SO(3)$ can be parameterized by picking
points in $S^3$, just like $SO(2)$ was parameterized by picking points in $S^1$ (ignoring
the antipodal identification problem for $SO(3)$).

However, one important reason why the quaternion arithmetic was introduced
is that the group of unit quaternions, with $h$ and $-h$ identified, is also isomorphic
to $SO(3)$. This means that
a sequence of rotations can be multiplied together using quaternion multiplication
instead of matrix multiplication. This is important because fewer operations are
required for quaternion multiplication in comparison to matrix multiplication. At
any point, (4.26) can be used to convert the result back into a matrix; however,
this is not even necessary. It turns out that a point in the world, $(x, y, z) \in \mathbb{R}^3$, can
be transformed by directly using quaternion arithmetic. An analog to the complex
conjugate from complex numbers will be needed. For any $h = a + bi + cj + dk \in \mathbb{H}$,
let $h^* = a - bi - cj - dk$. For any point $(x, y, z) \in \mathbb{R}^3$, let $p \in \mathbb{H}$ be the quaternion
$0 + xi + yj + zk$. It can be shown (with a lot of algebra) that the rotated point
$(x, y, z)$ is given by $h \cdot p \cdot h^*$. The $i$, $j$, $k$ components of the resulting quaternion
will be the new coordinates for the transformed point. It will be equivalent to having
transformed $(x, y, z)$ with the matrix $R(h)$.
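
Here is a minimal sketch of rotating a point with quaternion arithmetic alone. The
names `qmul`, `conjugate`, and `rotate_point` are illustrative assumptions, and the
quaternion passed in is assumed to have unit length. The example rotates the point
$(1, 0, 0)$ by a quarter turn about the $z$-axis.

```python
import math


def qmul(h1, h2):
    """Quaternion product of two (a, b, c, d) tuples."""
    a1, b1, c1, d1 = h1
    a2, b2, c2, d2 = h2
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + a2 * b1 + c1 * d2 - c2 * d1,
            a1 * c2 + a2 * c1 + b2 * d1 - b1 * d2,
            a1 * d2 + a2 * d1 + b1 * c2 - b2 * c1)


def conjugate(h):
    """Return h* = a - bi - cj - dk."""
    a, b, c, d = h
    return (a, -b, -c, -d)


def rotate_point(h, point):
    """Rotate (x, y, z) by the unit quaternion h via h . p . h*."""
    x, y, z = point
    p = (0.0, x, y, z)
    _, xr, yr, zr = qmul(qmul(h, p), conjugate(h))
    return (xr, yr, zr)


# Quarter turn about the z-axis: h = cos(pi/4) + sin(pi/4) k.
h = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
print(rotate_point(h, (1.0, 0.0, 0.0)))   # approximately (0, 1, 0)
```

Because $h$ appears twice in the product, replacing $h$ by $-h$ gives the same rotated
point, which agrees with the two-to-one mapping described earlier.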

Finding quaternion parameters from a rotation matrix Recall from
Section 3.2.3 that given a rotation matrix (3.35), the yaw, pitch, and roll parameters could
be directly determined using the atan2 function. It turns out that the quaternion
representation can also be determined directly from the matrix. This is the inverse
of the function in (4.26).⁹

For a given rotation matrix (3.35), the quaternion parameters, h = a + bi + 
cj + dk can be computed as follows [155]. The first component is 

a = ^Vru+r 22 + r 33 + l, (4.30) 

9 Since that function was two-to-one, it is technically not an inverse until the quaternions are 
restricted to the upper hemisphere, as described previously. 
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and if a ^ 0, then 



b=— — — , (4.31) 
4a 

c = I^LZZ* (4.32) 
4a 

and 

a 1 = ^Lf^a. (4.33) 
4a 

If a = 0, then the previously mentioned equator problem occurs. In this case,

    b = \frac{r_{13} r_{12}}{\sqrt{r_{12}^2 r_{13}^2 + r_{12}^2 r_{23}^2 + r_{13}^2 r_{23}^2}},    (4.34)

    c = \frac{r_{12} r_{23}}{\sqrt{r_{12}^2 r_{13}^2 + r_{12}^2 r_{23}^2 + r_{13}^2 r_{23}^2}},    (4.35)

and

    d = \frac{r_{13} r_{23}}{\sqrt{r_{12}^2 r_{13}^2 + r_{12}^2 r_{23}^2 + r_{13}^2 r_{23}^2}}.    (4.36)

This method will fail if r_12 = r_23 = 0 or r_13 = r_23 = 0 or r_12 = r_13 = 0,
because the common denominator then vanishes. These
correspond precisely to the cases in which the rotation matrix is a yaw, (3.31),
pitch, (3.32), or roll, (3.33), which can be detected in advance.
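The two branches of this computation can be sketched in code. The sketch below assumes that the quaternion-to-matrix map (4.26) is the standard one for unit quaternions, written out in `quaternion_to_matrix`; the function names and the tolerance `eps` are illustrative choices, not part of the text.

```python
import math


def quaternion_to_matrix(h):
    """Standard rotation matrix of a unit quaternion (assumed form of (4.26))."""
    a, b, c, d = h
    return [
        [2 * (a * a + b * b) - 1, 2 * (b * c - a * d), 2 * (b * d + a * c)],
        [2 * (b * c + a * d), 2 * (a * a + c * c) - 1, 2 * (c * d - a * b)],
        [2 * (b * d - a * c), 2 * (c * d + a * b), 2 * (a * a + d * d) - 1],
    ]


def matrix_to_quaternion(r, eps=1e-6):
    """Recover (a, b, c, d) with a >= 0 from a 3x3 rotation matrix r."""
    trace = r[0][0] + r[1][1] + r[2][2]
    a = 0.5 * math.sqrt(max(0.0, trace + 1.0))  # clamp rounding noise
    if a > eps:
        b = (r[2][1] - r[1][2]) / (4 * a)
        c = (r[0][2] - r[2][0]) / (4 * a)
        d = (r[1][0] - r[0][1]) / (4 * a)
        return (a, b, c, d)
    # Equator case (a = 0): use the off-diagonal products.
    r12, r13, r23 = r[0][1], r[0][2], r[1][2]
    denom = math.sqrt(r12**2 * r13**2 + r12**2 * r23**2 + r13**2 * r23**2)
    if denom < eps:
        raise ValueError("pure yaw, pitch, or roll by pi: handle separately")
    return (0.0, r13 * r12 / denom, r12 * r23 / denom, r13 * r23 / denom)


h = (0.8, 0.2, 0.4, 0.4)  # unit length: 0.64 + 0.04 + 0.16 + 0.16 = 1
print(matrix_to_quaternion(quaternion_to_matrix(h)))
```

In the equator case h and −h represent the same rotation, so the sign of the returned (b, c, d) is only one of two equally valid answers.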



Special Euclidean group Now that the complicated part of representing SO(3)
has been handled, the determination of SE(3) is straightforward. The general
form of a matrix in SE(3) is given by (4.19), in which R ∈ SO(3) and v ∈ R^3.
Since SO(3) ≅ RP^3, and the translations are chosen independently, the resulting
configuration space for a rigid body that rotates and translates in R^3 is

    C = \mathbb{R}^3 \times \mathbb{RP}^3,    (4.37)

which is a six-dimensional manifold. As expected, the dimension of C is exactly 
the number of degrees of freedom of a free-floating body in space. 



4.2.3 Chains and Trees of Bodies 

If there are multiple bodies that are allowed to move independently, then their
configuration spaces can be combined using Cartesian products. Let C_i denote
the configuration space of A_i. If there are n free-floating bodies in W = R^2 or
W = R^3, then

    C = C_1 \times C_2 \times \cdots \times C_n.    (4.38)

If the bodies are attached to form a kinematic chain or kinematic tree, then
each configuration space must be considered on a case-by-case basis. There is no
general rule that simplifies the process. One thing to be careful about
is that the full range of motion might not be possible for typical joints. For
example, a revolute joint might not be able to swing all of the way around to enable
any θ ∈ [0, 2π). If θ cannot wind around S^1, then the configuration space for
this joint is homeomorphic to R instead of S^1. A similar situation occurs for a
spherical joint. A typical ball joint cannot achieve any orientation in SO(3) due
to mechanical obstructions. In this case, the C-space will not be RP^3, because
part of SO(3) is missing.

Another complication in the process of determining the configuration space
is that the DH parameterization of Section 3.3.2 was designed to facilitate the
assignment of coordinate frames and computation of transformations, but neglects
considerations of topology. For example, a common approach to representing a
spherical robot wrist is to make three zero-length links that each behave as a
revolute joint. If the range of motion is limited, this might not cause problems,
but in general the problems would be similar to using yaw, pitch, roll to represent
SO(3). There may be multiple ways to express the same arm configuration.

Several examples are given below to help in determining C-spaces for chains
and trees of bodies. Suppose W = R^2, and there is a chain of n bodies that are
attached by revolute joints. Suppose that the first joint is capable of rotation only
about a fixed point (e.g., it spins around a nail). If each joint has the full range
of motion θ_i ∈ [0, 2π), the configuration space is

    C = S^1 \times S^1 \times \cdots \times S^1 = T^n.    (4.39)

However, if each joint is restricted to θ_i ∈ (−π/2, π/2), then C = R^n. If any
transformation in SE(2) can be applied to A_1, then an additional R^2 is needed.
In the case of restricted joint motions, this yields R^{n+2}. If the joints can achieve
any orientation, then C = R^2 × T^n. If there are prismatic joints, then each one
contributes an R to the C-space.

Recall from Figure 3.12 that for W = R^3 there are six different kinds of joints.
The cases of revolute and prismatic joints behave the same as for W = R^2. Each
screw joint contributes an R. A cylindrical joint contributes an R × S^1, unless its
rotational motion is restricted. A planar joint contributes R^2 × S^1 because any
motion in SE(2) is possible. If its rotational motions are restricted, then
it contributes R^3. Finally, a spherical joint can theoretically contribute RP^3. In
practice, however, this will rarely occur. It is more likely to contribute R^2 × S^1
or R^3 after restrictions are imposed. Note that if the first joint is a free-floating
body, then it contributes R^3 × RP^3.
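The bookkeeping in the last two paragraphs can be collected into a small lookup. This is only an illustrative sketch: the joint labels and the function `c_space` are hypothetical names, each entry assumes the unrestricted motion described above unless the label says otherwise, and the factors are plain strings rather than a model of the topology.

```python
# Factor contributed to the C-space by each kind of joint, with its dimension.
# Labels are hypothetical; "*_limited" means the rotational motion is restricted.
JOINT_FACTORS = {
    "revolute": ("S^1", 1),
    "revolute_limited": ("R", 1),
    "prismatic": ("R", 1),
    "screw": ("R", 1),
    "cylindrical": ("R x S^1", 2),
    "planar": ("R^2 x S^1", 3),
    "planar_limited": ("R^3", 3),
    "spherical": ("RP^3", 3),
    "free_body_3d": ("R^3 x RP^3", 6),
}


def c_space(joints):
    """Return the Cartesian product (as text) and its dimension for a chain."""
    factors = [JOINT_FACTORS[j][0] for j in joints]
    dimension = sum(JOINT_FACTORS[j][1] for j in joints)
    return " x ".join(factors), dimension


# A free-floating base carrying one revolute and one prismatic joint.
print(c_space(["free_body_3d", "revolute", "prismatic"]))
```

The dimension equals the number of degrees of freedom, which is unchanged by joint limits even though the topology changes.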

Kinematic trees can be handled in the same way as kinematic chains. One 
issue that has not been mentioned is that there might be collisions between the 
links. This has been ignored up to this point, but obviously this imposes very 
complicated restrictions. The concepts from Section 4.3 can be applied to handle 
this case and the placement of additional obstacles in W. Reasoning about these 
kinds of restrictions and the connectivity of the resulting space is indeed the main 
point of motion planning.

4.3 Configuration Space Obstacles

Section 4.2 defined C, the manifold of robot transformations, in the absence of
any collision constraints. Section 4.3 removes from C the configurations that
either cause the robot to collide with obstacles or cause different links of the
robot to collide with each other. The removed part is referred to as the obstacle region.
The leftover space is precisely where the planning occurs. A motion planning
algorithm must find a collision-free path from an initial configuration to a goal
configuration. Finally, after the models of Chapter ?? and the previous sections
of this chapter, the motion planning problem can be precisely described.

4.3.1 Definition of the Basic Motion Planning Problem 

Obstacle region for a rigid body Suppose that the world, W = R^2 or W =
R^3, contains an obstacle region, O ⊂ W. Assume here that a rigid robot, A ⊂ W,
is defined; the case of multiple links will be handled shortly. Assume that both
A and O are modeled using semi-algebraic primitives (which includes polygonal
and polyhedral primitives) from Section 3.1. Let q ∈ C denote the configuration
of A, in which q = (x_t, y_t, θ) for W = R^2 and q = (x_t, y_t, z_t, h) for W = R^3 (h
represents the unit quaternion).

The obstacle region, C_obs, is defined as

    C_{obs} = \{ q \in C \mid A(q) \cap O \neq \emptyset \},    (4.40)

which is the set of all configurations, q, at which A(q), the transformed robot,
intersects the obstacle region, O. Since O and A(q) are closed sets in W, the
obstacle region becomes a closed set in C.

The leftover configurations are called the free space, which is defined and
denoted as C_free = C \ C_obs. Since C is a topological space and C_obs is closed,
C_free must be an open set. This means that in the way the model is defined,
the robot can come arbitrarily close to the obstacles and remain in C_free. If A
"touches" O,

    \operatorname{int}(O) \cap \operatorname{int}(A(q)) = \emptyset \text{ and } O \cap A(q) \neq \emptyset,    (4.41)

then q ∈ C_obs. The notion of getting arbitrarily close may be nonsense in practical
robotics, but it makes a clean formulation of the motion planning problem. Since
C_free is open, it becomes impossible to formulate some optimization problems, such
as finding the shortest path. For such extensions, the closure, cl(C_free), should be
used, as described in Section 7.7.

Obstacle region for multiple bodies If the robot consists of multiple bodies, 
the situation is more complicated. The definition in (4.40) only implies that the 
robot does not collide with the obstacles; however, if the robot consists of multiple 
bodies, then it might also be appropriate to avoid collisions between different 
parts of the robot. Let the robot be modeled as a collection, {A_1, A_2, ..., A_m},
of m links, which could be attached by joints, or may be unattached. A single
configuration vector, q, is given for the entire collection of links. We will write
A_i(q) for each link, i, even though some of the parameters of q may be irrelevant
for moving link A_i. For example, in a kinematic chain, the configuration of the
second body does not depend on the angle between the ninth and tenth bodies.

Let P denote the set of collision pairs, in which each collision pair, (i, j),
represents a pair of link indices i, j ∈ {1, 2, ..., m}, such that i ≠ j. If (i, j)
appears in P, it means that A_i and A_j are not allowed to be in a configuration,
q, for which A_i(q) ∩ A_j(q) ≠ ∅. Usually, P does not represent all pairs because
consecutive links are usually in contact all of the time because of the joint between
them. One common definition for P is that each link must avoid collisions with
links to which it is not attached by a joint. For m bodies, P is generally of size
O(m^2); however, in practice it is often possible to eliminate many pairs by some
geometric analysis of the linkage. Collisions between some pairs of links may be
impossible over all of C, in which case, they do not need to appear in P.

Using P, the consideration of robot self-collisions may be added to the
definition of C_obs to obtain

    C_{obs} = \left( \bigcup_{i=1}^{m} \{ q \in C \mid A_i(q) \cap O \neq \emptyset \} \right) \cup \left( \bigcup_{(i,j) \in P} \{ q \in C \mid A_i(q) \cap A_j(q) \neq \emptyset \} \right).    (4.42)

Thus, a configuration q ∈ C is in C_obs if at least one link collides with O, or a pair
of links indicated by P collide with each other.
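The membership test described by (4.42) can be sketched for a deliberately simple model. The sketch assumes unattached links that are closed disks in the plane, so that q is just a list of disk centers, and obstacles that are also closed disks; these modeling choices, the 0-based link indices, and all names are assumptions made for illustration.

```python
import math


def disks_intersect(center1, radius1, center2, radius2):
    """Closed disks intersect when they overlap or merely touch."""
    return math.dist(center1, center2) <= radius1 + radius2


def in_c_obs(q, radii, obstacles, collision_pairs):
    """Test q against the union in (4.42).

    q: list of link centers, one (x, y) per link.
    radii: radius of each link.
    obstacles: list of (center, radius) disks making up O.
    collision_pairs: the set P, as (i, j) index pairs with i != j.
    """
    # First union: some link A_i(q) intersects the obstacle region O.
    for center, radius in zip(q, radii):
        for obs_center, obs_radius in obstacles:
            if disks_intersect(center, radius, obs_center, obs_radius):
                return True
    # Second union: some pair of links listed in P intersects.
    for i, j in collision_pairs:
        if disks_intersect(q[i], radii[i], q[j], radii[j]):
            return True
    return False


obstacles = [((0.0, 0.0), 1.0)]
radii = [0.5, 0.5]
pairs = [(0, 1)]
print(in_c_obs([(3.0, 0.0), (5.0, 0.0)], radii, obstacles, pairs))  # False
```

Because the sets are closed, a configuration in which a link only touches the obstacle is reported as being in C_obs, in agreement with (4.41).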



Definition of basic motion planning Finally, enough tools have been
introduced to precisely define the motion planning problem. The problem is
conceptually illustrated in Figure 4.11. The main difficulty is that it is neither
straightforward nor efficient to construct an explicit boundary or solid representation of
either C_free or C_obs.

Formulation 4.3.1 (The Piano Mover's Problem) 

1. A world, W, is defined, in which either W = R^2 or W = R^3.

2. A semi-algebraic obstacle region O ⊂ W is defined in the world.

3. A semi-algebraic robot is defined in W. It may be a rigid robot, A, or a
collection of links, A_1, A_2, ..., A_m.

4. The configuration space, C, is determined by specifying the set of all possible
transformations that may be applied to the robot. From this, C_obs and C_free
are derived.

5. A configuration q_I ∈ C_free is designated as the initial configuration.

6. A configuration q_G ∈ C_free is designated as the goal configuration.

7. An algorithm must compute a (continuous) path, τ : [0, 1] → C_free, such that
τ(0) = q_I and τ(1) = q_G, or correctly report that such a path does not exist.

Figure 4.11: The basic motion planning problem is conceptually very simple using
the configuration space ideas. The task is to find a path from q_I to q_G in C_free.
The entire blob represents C = C_free ∪ C_obs.

It was shown by Reif [651] that this problem is PSPACE-hard, which implies 
NP-hard. The main problem is that the dimension of C is unbounded. 

4.3.2 Explicitly Modeling C_obs: The Translational Case

It is important to understand how to construct a representation of C_obs. In some
algorithms, especially the combinatorial methods of Chapter 6, this represents 
an important first step to solving the problem. In other algorithms, especially 
the sampling-based planning algorithms of Chapter 5, it helps to understand why 
such constructions are avoided due to their complexity. 

The simplest case for characterizing C_obs is when C = R^n for n = 1, 2, and 3,
and the robot is a rigid body that is restricted to translation only. Under these
conditions, C_obs can be expressed as a type of convolution. For any two subsets
X, Y ⊂ R^n, let their Minkowski difference, denoted by X ⊖ Y, be

    X \ominus Y = \{ x - y \in \mathbb{R}^n \mid x \in X \text{ and } y \in Y \},    (4.43)

in which x − y is just vector subtraction on R^n.

In terms of the Minkowski difference, C_obs = O ⊖ A(0). To see this, it is
helpful to consider a one-dimensional example. The Minkowski difference between
X and Y can also be considered as the Minkowski sum of X and −Y. The
Minkowski sum, ⊕, is obtained by simply adding elements of X and Y, as opposed
to subtracting them. The set −Y is obtained by replacing each y ∈ Y by −y.
In Figure 4.12, both the robot, A = [−1, 2], and the obstacle region, O = [0, 4], are
intervals in a one-dimensional world, W = R.

The negation, −A, of the robot is shown as the interval [−2, 1]. Finally, by
applying the Minkowski sum to O and −A, C_obs = [−2, 5] is obtained.

Figure 4.12: A one-dimensional example.
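The one-dimensional example is small enough to verify directly. The sketch below assumes closed intervals stored as (low, high) pairs; the helper names are illustrative. It computes O ⊖ A for the intervals of Figure 4.12 and then compares the result against a direct collision test over a grid of translations.

```python
def minkowski_difference(x_interval, y_interval):
    """X - Y for closed intervals: {x - y | x in X, y in Y}."""
    (x_lo, x_hi), (y_lo, y_hi) = x_interval, y_interval
    return (x_lo - y_hi, x_hi - y_lo)


def collides(q, robot, obstacle):
    """Does the robot interval, translated by q, intersect the obstacle?"""
    lo, hi = robot[0] + q, robot[1] + q
    return lo <= obstacle[1] and obstacle[0] <= hi


robot = (-1, 2)     # A
obstacle = (0, 4)   # O
c_obs = minkowski_difference(obstacle, robot)
print(c_obs)  # (-2, 5)

# Cross-check against the definition of C_obs on a grid of translations.
for k in range(-40, 81):
    q = k / 4.0
    assert collides(q, robot, obstacle) == (c_obs[0] <= q <= c_obs[1])
```

The endpoints −2 and 5 are the translations at which the robot just touches the obstacle, so they belong to C_obs.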

The Minkowski difference is often considered as a convolution. It can even
be defined to appear the same as the convolution studied in differential equations
and system theory. For the one-dimensional example, let f : R → {0, 1} be a function such
that f(x) = 1 if and only if x ∈ O. Similarly, let g : R → {0, 1} be a function
such that g(x) = 1 if and only if x ∈ A. The following integral,

    h(x) = \int_{-\infty}^{\infty} f(\tau) \, g(\tau - x) \, d\tau,

which is the convolution of f with the reflected function g(−x), will yield a
function h of x that is positive if x lies in the interior of C_obs, and 0 otherwise.

A polygonal C-space obstacle An efficient method of computing C_obs exists
in the case of a 2D world that contains a convex polygonal obstacle, O, and a
convex polygonal robot, A [504]. For this problem, C_obs is also a convex polygon.
Recall that nonconvex obstacles and robots can be modeled as the union of convex
parts. The concepts discussed below can also be applied in the nonconvex case by
considering C_obs as the union of convex components, each of which corresponds to
a convex component of A colliding with a convex component of O.

The method is based on sorting normals to the edges of the polygons on the
basis of angles. The key observation is that every edge of C_obs is a translated edge
from either A or O. In fact, every edge from O and A is used exactly once in
the construction of C_obs. The only problem is to determine the ordering of these
edges of C_obs. Let α_1, α_2, ..., α_n denote the angles of the inward edge normals
in counterclockwise order around A. Let β_1, β_2, ..., β_m denote the outward edge
normals to O. After sorting both sets of angles in circular order around S^1, C_obs can
be constructed incrementally by adding the edges that correspond to the sorted
normals, in the order in which they are encountered.

Figure 4.13: A triangular robot and a rectangular obstacle.

Figure 4.14: Slide the robot around the obstacle while keeping them both in
contact.

To gain an understanding of the method, consider the case of a triangular
robot and a rectangular obstacle, as shown in Figure 4.13. The black dot on A
denotes the origin of its coordinate frame. Consider sliding the robot around the
obstacle in such a way that they are always in contact, as shown in Figure 4.14.
This corresponds to the traversal of all of the configurations in ∂C_obs, the
boundary of C_obs. The origin of A will trace out the edges of C_obs, as shown in
Figure 4.15. There are 7 edges, and each edge corresponds to either an edge of A
or an edge of O. The directions of the normals are defined as shown in Figure 4.16.
When sorted as shown in Figure 4.17, the edges of C_obs can be incrementally constructed.

The running time of the algorithm is O(n + m), in which n is the number of
edges defining A, and m is the number of edges defining O. Note that the angles
can be sorted in linear time because they already appear in counterclockwise order
around A and O; they only need to be merged. If two edges are collinear, then
they can be placed end-to-end as a single edge of C_obs.
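The merge step can be sketched as follows. This is an illustrative implementation rather than the exact procedure of [504]: it merges edge directions instead of edge normals (the two orderings differ only by a quarter turn), it assumes both polygons are convex with vertices listed counterclockwise, and it leaves a redundant vertex where two edges are parallel. The example coordinates are assumed for illustration; the function names are hypothetical.

```python
import math
from heapq import merge


def _edges_from_lowest(poly):
    """Start vertex and edge vectors of a counterclockwise convex polygon.

    Starting at the lowest (then leftmost) vertex makes the edge angles
    nondecreasing in [0, 2*pi), so no sorting is needed.
    """
    k = min(range(len(poly)), key=lambda i: (poly[i][1], poly[i][0]))
    verts = poly[k:] + poly[:k]
    count = len(verts)
    edges = [
        (verts[(i + 1) % count][0] - verts[i][0],
         verts[(i + 1) % count][1] - verts[i][1])
        for i in range(count)
    ]
    return verts[0], edges


def _angle(edge):
    return math.atan2(edge[1], edge[0]) % (2.0 * math.pi)


def translational_c_obs(obstacle, robot):
    """Vertices of C_obs = O (+) (-A) for a translating convex robot."""
    negated_robot = [(-x, -y) for x, y in robot]
    start_o, edges_o = _edges_from_lowest(obstacle)
    start_r, edges_r = _edges_from_lowest(negated_robot)
    vertex = (start_o[0] + start_r[0], start_o[1] + start_r[1])
    boundary = []
    # Linear-time merge of two already sorted edge lists.
    for edge in merge(edges_o, edges_r, key=_angle):
        boundary.append(vertex)
        vertex = (vertex[0] + edge[0], vertex[1] + edge[1])
    return boundary


robot = [(1, 0), (0, 1), (-1, -1)]               # triangle, origin inside
obstacle = [(1, 1), (-1, 1), (-1, -1), (1, -1)]  # square
print(translational_c_obs(obstacle, robot))
```

For a triangle and a rectangle with no parallel edges the result has 3 + 4 = 7 vertices. With floating-point coordinates, an edge angle that rounds to slightly below zero would wrap to just under 2π, so a robust version would compare edges with cross products instead of `atan2`.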

The previous method quickly identifies each edge that contributes to C_obs. This
method can also construct a solid representation of C_obs in terms of half planes. This
requires defining n + m linear equations (assuming there are no collinear edges).

There are two different ways in which an edge of C_obs is generated, as shown
in Figure 4.18 [207, 504]. Type EV contact (labeled Type A in Figure 4.18) refers
to the case in which an edge of A is in contact with a vertex of O. Type EV contacts
contribute n edges to C_obs, one for each edge of A. Type B contact refers to the
case in which a vertex of A is in contact with an edge of O. This contributes m
edges to C_obs. The relationships between the edge normals are also shown in
Figure 4.18. For Type EV, the inward edge normal of A lies between the outward
edge normals of the obstacle edges that share the contact vertex. Likewise for
Type B, the outward edge normal of O lies between the inward edge normals of the
robot edges that share the contact vertex.

Figure 4.17: The edge normals are sorted by orientation.

Figure 4.18: Two different types of contact (Type A and Type B), each of which
generates a different kind of edge [205, 504].

Using the ordering shown in Figure 4.17, Type EV contacts occur precisely
when an edge normal of A is encountered, and Type B contacts occur precisely
when an edge normal of O is encountered. The task is to determine a line equation
at each of these instances. Consider the case of a Type EV contact; the Type B
contact can be handled in a similar manner. In addition to the constraint on the
directions of the edge normals, the contact vertex of O must lie on the contact
edge of A. Recall that convex obstacles were constructed by the intersection of
half planes. Each edge of C_obs can be defined in terms of a supporting half plane;
hence, it is only necessary to determine whether the vertex of O lies on the line
through the contact edge of A. This condition occurs precisely when the vectors
n and v, shown in Figure 4.19, are perpendicular, i.e., n · v = 0.

Note that the normal vector, n, does not depend on the configuration of A
because it can only translate. The vector v, however, depends on the translation,
q = (x_t, y_t), of the point p. Therefore, it is more appropriate to write the condition
as n · v(x_t, y_t) = 0. The transformation equations are linear for translation; hence,
n · v = 0 is the equation of a line in C. For example, if the coordinates of p are
(1, 2) when A is at the origin, then the expression for p at configuration (x_t, y_t) is
(1 + x_t, 2 + y_t). Let f(x_t, y_t) = n · v. Let H = {(x_t, y_t) ∈ C | f(x_t, y_t) ≤ 0}. Observe
that the configurations not in H must lie in C_free. The half plane H is used to
define one edge of C_obs. The obstacle region C_obs can be completely characterized
by intersecting the resulting half planes for each of the Type EV and Type B
contacts. This yields a convex polygon in C that has n + m sides, as expected.

Figure 4.19: Contact occurs when n and v are perpendicular.
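The half planes can be generated mechanically once the contact pairs are known. The sketch below represents each half plane by coefficients (α, β, γ) of f(x_t, y_t) = α x_t + β y_t + γ ≤ 0. It assumes counterclockwise vertex order and takes n as the edge direction rotated clockwise by a quarter turn, a sign convention chosen here so that f ≤ 0 holds on the C_obs side. The coordinates and the list of seven contacts are assumptions for a triangle and a square; all names are illustrative.

```python
def _normal(edge_start, edge_end):
    """Edge direction rotated clockwise by 90 degrees (not normalized)."""
    return (edge_end[1] - edge_start[1], -(edge_end[0] - edge_start[0]))


def half_plane_ev(edge_start, edge_end, obstacle_vertex):
    """Edge of A (given at q = (0, 0)) against a vertex of O.

    v = obstacle_vertex - (edge_start + q), so f = n . v is linear in q.
    """
    nx, ny = _normal(edge_start, edge_end)
    gamma = (nx * (obstacle_vertex[0] - edge_start[0])
             + ny * (obstacle_vertex[1] - edge_start[1]))
    return (-nx, -ny, gamma)


def half_plane_ve(robot_vertex, edge_start, edge_end):
    """Vertex of A (given at q = (0, 0)) against an edge of O.

    v = (robot_vertex + q) - edge_start.
    """
    nx, ny = _normal(edge_start, edge_end)
    gamma = (nx * (robot_vertex[0] - edge_start[0])
             + ny * (robot_vertex[1] - edge_start[1]))
    return (nx, ny, gamma)


def in_c_obs(q, half_planes):
    """q is in C_obs when it satisfies every half-plane constraint."""
    return all(al * q[0] + be * q[1] + ga <= 0 for al, be, ga in half_planes)


a = {1: (1, 0), 2: (0, 1), 3: (-1, -1)}              # robot, counterclockwise
b = {1: (1, 1), 2: (-1, 1), 3: (-1, -1), 4: (1, -1)}  # obstacle, counterclockwise
half_planes = [
    half_plane_ve(a[3], b[4], b[1]),
    half_plane_ve(a[3], b[1], b[2]),
    half_plane_ev(a[3], a[1], b[2]),
    half_plane_ve(a[1], b[2], b[3]),
    half_plane_ev(a[1], a[2], b[3]),
    half_plane_ve(a[2], b[3], b[4]),
    half_plane_ev(a[2], a[3], b[4]),
]
print(in_c_obs((0, 0), half_planes), in_c_obs((3, 0), half_planes))
```

Scaling a constraint by a positive constant does not change the half plane, so the unnormalized normals are harmless for the membership test.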

Example 4.3.1 Consider building a geometric model of C_obs for the example in
Figure 4.20. Suppose that the orientation of A is fixed as shown, and C = R^2. In
this case, C_obs will be a convex polygon with seven sides. The contact conditions
that occur are shown in Table 4.1. The ordering is given as normals appear as
shown in Figure 4.17.




Figure 4.20: Consider constructing the obstacle region for this example. (Panels:
Robot, Obstacle.)

Type  Vtx.  Edge     n         v                      Half Plane
B     a_3   b_4-b_1  [1, 0]    [x_t − 2, y_t]         H_B = {q ∈ C | x_t − 2 ≤ 0}
B     a_3   b_1-b_2  [0, 1]    [x_t − 2, y_t − 2]     H_B = {q ∈ C | y_t − 2 ≤ 0}
A     b_2   a_3-a_1  [1, −2]   [−x_t, 2 − y_t]        H_A = {q ∈ C | −x_t + 2y_t − 4 ≤ 0}
B     a_1   b_2-b_3  [−1, 0]   [2 + x_t, y_t − 1]     H_B = {q ∈ C | −x_t − 2 ≤ 0}
A     b_3   a_1-a_2  [1, 1]    [−1 − x_t, −2 − y_t]   H_A = {q ∈ C | −x_t − y_t − 3 ≤ 0}
B     a_2   b_3-b_4  [0, −1]   [x_t + 1, y_t + 2]     H_B = {q ∈ C | −y_t − 2 ≤ 0}
A     b_4   a_2-a_3  [−2, 1]   [2 − x_t, −y_t]        H_A = {q ∈ C | 2x_t − y_t − 4 ≤ 0}



Table 4.1: The various contact conditions are shown in the order as normals 
appear in Figure 4.17. 



A polyhedral C-space obstacle Most of the previous ideas generalize nicely
for the case of a polyhedral robot that is capable of translation only in a 3D
world that contains polyhedral obstacles. If A and O are convex polyhedra, the
resulting C_obs is a convex polyhedron.

There are three different kinds of contacts that lead to half spaces:

• Type FV: A face of A and a vertex of O

• Type VF: A vertex of A and a face of O

• Type EE: An edge of A and an edge of O

Each half space defines a face of the polyhedron. The resulting polyhedron can
be constructed in O(n + m + k) time, in which n is the number of faces of A, m
is the number of faces of O, and k is the number of faces of C_obs, which is at most
nm [].

4.3.3 Explicitly Modeling C_obs: The General Case

Unfortunately, the cases in which C_obs is polygonal or polyhedral are quite
limited. Most problems yield extremely complicated C-space obstacles. One good
point is that C_obs can be expressed using semi-algebraic models, for any robots
and obstacles defined using semi-algebraic models, and after applying any of the
transformations of Sections 3.2 to 3.4. It might not be true for other kinds of
transformations, such as parameters that warp a flexible material [?].

Consider the case of a convex polygonal robot and a convex polygonal obstacle
in a 2D world. Any transformation in SE(2) may be applied to A; thus, C = R^2 × S^1
and q = (x_t, y_t, θ). The task is to define a set of algebraic primitives that can
be combined to define C_obs. Once again, it is important to distinguish between
Type EV and Type B contacts. We will describe how to construct the algebraic
primitives for the Type EV contacts; Type B can be handled in a similar manner.

For the translation-only case, we were able to determine all of the Type EV
conditions by sorting the edge normals. With rotation, the ordering of edge
normals depends on θ. This implies that the applicability of a Type EV contact
depends on θ. Recall the constraint that the inward normal of A must lie between
the outward normals of the edges of O that contain the vertex of contact. See
Figure 4.21. This constraint can be expressed in terms of inner products using
the vectors v_1 and v_2. The statement regarding the directions of the normals can
equivalently be formulated as the statement that the angle between n and v_1, and
between n and v_2, must each be less than π/2. Using inner products, this implies
that n · v_1 > 0 and n · v_2 > 0. As in the translation case, the condition n · v = 0 is
required for contact. Observe that n depends on q. For any q ∈ C, if n(q) · v_1 > 0,
n(q) · v_2 > 0, and n(q) · v(q) > 0, then q ∈ C_free. Let H_f denote the set of
configurations that satisfy these conditions. These conditions can be used to
determine whether a point is in C_free; however, this is not a complete characterization
of C_free; any other Type EV and Type B contacts could add more points to C_free.
Ordinarily, H_f ⊂ C_free, which implies that the complement, C \ H_f, is a superset
of C_obs, i.e., C_obs ⊂ C \ H_f. Let H_A = C \ H_f. Let the following primitives,

    H_1 = \{ q \in C \mid n(q) \cdot v_1 \leq 0 \},    (4.44)

    H_2 = \{ q \in C \mid n(q) \cdot v_2 \leq 0 \},    (4.45)

and

    H_3 = \{ q \in C \mid n(q) \cdot v(q) \leq 0 \},    (4.46)

define H_A = H_1 ∪ H_2 ∪ H_3.

It is known that C_obs ⊆ H_A, but H_A may still overlap with C_free. The
situation is similar to what was explained in Section 3.1.1 for building a model of a
convex polygon from half planes. In the current setting, it is only known that any
configuration outside of H_A must be in C_free. If H_A is intersected with all other
corresponding sets for each possible Type EV and Type B contact, the result will
be C_obs. Each contact has the opportunity to remove a portion of C_free from
consideration. Eventually, enough pieces of C_free are removed so that the only
configurations remaining lie in C_obs. For any Type EV constraint, (H_1 ∪ H_2) \ H_3 ⊆ C_free.
A similar statement can be made for Type B constraints. A logical predicate,
similar to that defined in Section 3.1.1, can be constructed to detect whether or not
q ∈ C_obs in time that is linear in the number of C_obs primitives.

One important issue remains. The expression n(q) is not a polynomial because
of the cos θ and sin θ terms in the rotation matrix of SO(2). If polynomials could
be substituted for these expressions, then everything would be fixed because the
expression of the normal vector (not a unit normal) and the inner product are
both linear functions, thus transforming polynomials into polynomials. Such a
substitution can be made using stereographic projection [437]; however, a simpler
substitution can be made using complex numbers to represent rotation. Recall
that when a + bi is used to represent rotation, each rotation matrix in SO(2) is
represented as (4.21), and the 3 × 3 homogeneous transformation matrix becomes

    \begin{pmatrix} a & -b & x_t \\ b & a & y_t \\ 0 & 0 & 1 \end{pmatrix}.    (4.47)

Using this matrix to transform a point [x y 1] results in the point coordinates
(ax − by + x_t, bx + ay + y_t). Thus, any transformed point on A will be a linear
function of a, b, x_t, and y_t.
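A short numerical check makes the substitution concrete. The sketch below assumes a = cos θ and b = sin θ, applies the transformation as a function of (x_t, y_t, a, b), and compares it with the trigonometric form; the function names are illustrative.

```python
import math


def transform_point(x, y, x_t, y_t, a, b):
    """Apply the homogeneous transformation parameterized by (x_t, y_t, a, b)."""
    return (a * x - b * y + x_t, b * x + a * y + y_t)


def transform_point_trig(x, y, x_t, y_t, theta):
    """The same transformation written with cos and sin of theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y + x_t, s * x + c * y + y_t)


def on_unit_circle(a, b, tol=1e-12):
    """The constraint a^2 + b^2 = 1 that keeps (a, b) a valid rotation."""
    return abs(a * a + b * b - 1.0) <= tol


theta = 0.7
a, b = math.cos(theta), math.sin(theta)
print(transform_point(1.0, 2.0, 0.5, -1.0, a, b))
print(transform_point_trig(1.0, 2.0, 0.5, -1.0, theta))
print(on_unit_circle(a, b))
```

Each output coordinate is a sum of products of one configuration variable with a constant, which is what makes the resulting primitives linear in (x_t, y_t, a, b).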

This was a simple trick to make a nice, linear function, but what was the cost?
The dependency is now on a and b, instead of θ. This appears to increase the
dimension of C from 3 to 4, and C = R^4. However, an algebraic primitive will be
added to constrain the angles to lie in S^1.

By using complex numbers, primitives in R^4 are obtained for each Type EV
and Type B contact. By defining C = R^4, the following algebraic primitives are
obtained for a Type EV contact:

    H_1 = \{ (x_t, y_t, a, b) \in C \mid n(x_t, y_t, a, b) \cdot v_1 \leq 0 \},    (4.48)

    H_2 = \{ (x_t, y_t, a, b) \in C \mid n(x_t, y_t, a, b) \cdot v_2 \leq 0 \},    (4.49)

and

    H_3 = \{ (x_t, y_t, a, b) \in C \mid n(x_t, y_t, a, b) \cdot v(x_t, y_t, a, b) \leq 0 \}.    (4.50)

This yields H_A = H_1 ∪ H_2 ∪ H_3. To preserve the correct R^2 × S^1 topology of C,
the set

    H_s = \{ (x_t, y_t, a, b) \in C \mid a^2 + b^2 - 1 = 0 \}    (4.51)

is intersected with H_A. This constraint preserves the topology of the original
configuration space. The set H_s remains fixed over all Type EV and Type B
contacts; therefore, it only needs to be considered once.

Example 4.3.2 Consider adding rotation to the model considered in Example
4.3.1. In this case, all possible contacts must be considered. For this example,
there are 12 Type EV contacts and 12 Type B contacts. Each contact produces 3
algebraic primitives. With the inclusion of H_s, this simple example produces 73
primitives! Rather than construct all of these, we derive the primitives for a single
contact. Consider the Type B contact between a_3 and b_4-b_1. The outward edge
normal, n, remains fixed at n = [1, 0]. The vectors v_1 and v_2 are derived from the
edges that share a_3, which are a_3-a_2 and a_3-a_1. Note that each of a_1, a_2, and a_3
depends on the configuration. Using the 2D homogeneous transformation, (3.30),
a_1 at configuration (x_t, y_t, θ) is (cos θ + x_t, sin θ + y_t). Using a + bi to represent
rotation, the expression of a_1 becomes (a + x_t, b + y_t). The expressions of a_2 and
a_3 are (−b + x_t, a + y_t) and (−a + b + x_t, −b − a + y_t), respectively. It follows that
v_1 = a_2 − a_3 = [a − 2b, 2a + b] and v_2 = a_1 − a_3 = [2a − b, a + 2b]. Note that v_1
and v_2 depend only on the orientation of A, as expected. Assume that v is drawn
from b_4 to a_3. This yields v = a_3 − b_4 = [−a + b + x_t − 1, −a − b + y_t + 1]. The
inner products v_1 · n, v_2 · n, and v · n can easily be computed to form H_1, H_2, and
H_3 as algebraic primitives.
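The expressions in this example can be checked numerically. The sketch below assumes the robot vertices a_1 = (1, 0), a_2 = (0, 1), a_3 = (−1, −1) in the body frame and the obstacle vertex b_4 = (1, −1), which are the values implied by the expressions above; the function names are illustrative. It evaluates the three inner products and compares them with the closed forms a − 2b, 2a − b, and −a + b + x_t − 1.

```python
import math

# Assumed body-frame vertices, inferred from the expressions in the example.
ROBOT = {1: (1.0, 0.0), 2: (0.0, 1.0), 3: (-1.0, -1.0)}
B4 = (1.0, -1.0)
N = (1.0, 0.0)  # outward normal of obstacle edge b_4-b_1


def place(p, x_t, y_t, a, b):
    """Transform a body-frame point using the (x_t, y_t, a, b) parameters."""
    return (a * p[0] - b * p[1] + x_t, b * p[0] + a * p[1] + y_t)


def dot(u, w):
    return u[0] * w[0] + u[1] * w[1]


def contact_values(x_t, y_t, a, b):
    """Return (n . v_1, n . v_2, n . v) for the a_3 / b_4-b_1 contact."""
    a1, a2, a3 = (place(ROBOT[i], x_t, y_t, a, b) for i in (1, 2, 3))
    v1 = (a2[0] - a3[0], a2[1] - a3[1])
    v2 = (a1[0] - a3[0], a1[1] - a3[1])
    v = (a3[0] - B4[0], a3[1] - B4[1])
    return dot(N, v1), dot(N, v2), dot(N, v)


def in_h_a(x_t, y_t, a, b):
    """Membership in H_1, H_2, or H_3, each taken with a <= 0 condition."""
    return any(value <= 0 for value in contact_values(x_t, y_t, a, b))


theta, x_t, y_t = 0.7, 0.3, -1.2
a, b = math.cos(theta), math.sin(theta)
print(contact_values(x_t, y_t, a, b))
print((a - 2 * b, 2 * a - b, -a + b + x_t - 1))
```

All three values are linear in (x_t, y_t, a, b); the only nonlinear ingredient is the separate constraint a^2 + b^2 = 1.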

One interesting observation can be made here. The only nonlinear primitive
is a^2 + b^2 = 1. Therefore, C_obs can be considered as a linear polytope (like a
polyhedron, but one dimension higher) in R^4 that is intersected with a cylinder.



3D Rigid Bodies For the case of a 3D rigid body to which any transformation
in SE(3) may be applied, the same general principles apply. The quaternion
parameterization once again becomes the right way to represent SO(3) because
using (4.26) avoids all trigonometric functions in the same way that (4.21) avoided
them for SO(2). Unfortunately, (4.26) is not linear in the configuration variables,
as it was for (4.21), but it is at least polynomial. This enables semi-algebraic
models to be formed for C_obs. Recall that there will be Type FV, VF, and EE
contacts for the case of SE(3). From all of the contact conditions, polynomials that
correspond to each patch of C_obs can be made. Note that these patches will be
polynomials in seven variables: x_t, y_t, z_t, a, b, c, d. Once again, a special primitive
must be intersected with all others to enforce the constraint that unit quaternions
are used. This reduces the dimension from 7 back down to 6. Also, constraints may
be added to throw away half of S^3, which is redundant because of the identification.

Chains and Trees of Bodies For chains and trees of bodies, the ideas are
conceptually the same, but the algebra becomes more cumbersome. Recall that the
transformation for each link is obtained by a product of homogeneous transformation
matrices, as given in (3.45) and (3.49) for the 2D and 3D cases, respectively.
If the rotation part is parameterized using complex numbers for SO(2) or
quaternions for SO(3), then each matrix will consist of polynomial entries. After the
matrix product is formed, polynomial expressions in terms of the configuration
variables are obtained. Therefore, a semi-algebraic model can be constructed.
For each link, all of the contact types need to be considered. Extrapolating from
Examples 4.3.1 and 4.3.2, you can imagine that no human would ever want to do
all of that by hand, but at least it can be automated. It is also very important
for the existence of theoretical algorithms that solve the motion planning problem
combinatorially.

If the kinematic chains were formulated for W = R^3 using the DH
parameterization, it may be inconvenient to convert to the quaternion representation. One
way to avoid this is to use complex numbers to represent each of the θ_i and α_i
variables that appear as configuration variables. This can be accomplished
because only cos and sin functions appear in the transformation matrices. These can
be replaced by the real and imaginary parts, respectively, of a complex number.
The dimension will be increased, but this will be appropriately reduced when
imposing the constraints that all complex numbers must have unit magnitude.

4.4 Kinematic Closure and Varieties 

This section continues where the discussion at the end of Section 3.4 finished. 
Suppose that a collection of links are arranged in a way that forms loops. In 
this case, the configuration space becomes much more complicated because the 
joint angles must be chosen to ensure that the loops remain closed. This leads 
to constraints such as that shown in (3.72) and Figure 3.27, in which some links 
must maintain specified positions relative to each other. Consider the set of all 
configurations that satisfy such constraints. Is this a manifold? It turns out,
unfortunately, that the answer is no. However, the configuration space belongs to a
nice family of spaces from algebraic geometry called varieties. Algebraic geometry
deals with characterizing the solution sets of polynomials. As seen so far in this
chapter, all of the kinematics can be expressed as polynomials. Therefore, it may
not be surprising that the resulting constraints will be a system of polynomials
whose solution set represents the configuration space for closed kinematic linkages.
Although the algebraic varieties considered here need not be manifolds, they can
be decomposed into a finite collection of manifolds that fit together nicely.^10

Unfortunately, a parameterization of the variety that arises from closed chains 
is available in only a few simple cases. Even the topology of the variety is ex- 
tremely difficult to characterize. To make matters worse, it was proved in [369] 
that for every closed, bounded real algebraic variety that can be embedded in 
$\mathbb{R}^n$, there exists a linkage whose configuration space is homeomorphic
to it. This difficulty implies that most of the time, motion planning
algorithms need to manipulate implicit polynomials when searching the space. For
the algebraic methods of Section 6.4.2, this will not pose any conceptual
difficulty because the methods already work directly with polynomials.
Sampling-based methods usually rely on being able to sample configurations,
which cannot be easily adapted to a variety without a parameterization. Section
7.4 covers recent methods that extend sampling-based planning algorithms to work
for varieties that arise from closed chains.

10 This is called a Whitney stratification [123, 772].

4.4.1 Mathematical Concepts 

To understand varieties, it will be helpful to have definitions of polynomials and 
their solutions that are more formal than the presentation in Chapter 3. 

Fields Polynomials are usually defined over a field, which is another object from
algebra. A field is similar to a group, but it has more operations and axioms. The
definition is given below, and while reading it, it may be helpful to keep in
mind several familiar examples of fields: the rationals, $\mathbb{Q}$, the reals,
$\mathbb{R}$, and the complex plane, $\mathbb{C}$. You may verify that these
fields satisfy the six axioms below.

A field is a set $F$ that has two binary operations, $\cdot : F \times F \to F$
(called multiplication) and $+ : F \times F \to F$ (called addition), for which
the following axioms are satisfied:

1. (Associativity) For all $a, b, c \in F$, $(a + b) + c = a + (b + c)$ and
$(a \cdot b) \cdot c = a \cdot (b \cdot c)$.

2. (Commutativity) For all $a, b \in F$, $a + b = b + a$ and
$a \cdot b = b \cdot a$.

3. (Distributivity) For all $a, b, c \in F$,
$a \cdot (b + c) = a \cdot b + a \cdot c$.

4. (Identities) There exist $0, 1 \in F$, such that $a + 0 = a \cdot 1 = a$ for
all $a \in F$.

5. (Additive Inverses) For every $a \in F$, there exists some $b \in F$ such that
$a + b = 0$.

6. (Multiplicative Inverses) For every $a \in F$, except $a = 0$, there exists
some $c \in F$ such that $a \cdot c = 1$.

Compare these axioms to the group definition from Section 4.2.1. Note that 
a field can be considered as two different kinds of groups, one with respect to 
multiplication, and the other with respect to addition. Fields additionally require 
commutativity; hence, we cannot, for example, build a field from quaternions. 
The distributivity axiom appears because there is now an interaction between 
two different operations, which was not possible with groups. 
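The six axioms can be checked mechanically when the set is finite. The following sketch is an illustration that goes slightly beyond the examples named in the text: it assumes the integers modulo n as the candidate set (a choice made here only because it allows exhaustive testing) and reports whether all six axioms hold. The function name is hypothetical.

```python
from itertools import product

def is_field_mod(n):
    """Exhaustively test the six field axioms for {0, ..., n-1} with + and * mod n.

    Intended for n >= 2; the run time grows as n**3.
    """
    elems = range(n)

    def add(a, b):
        return (a + b) % n

    def mul(a, b):
        return (a * b) % n

    for a, b, c in product(elems, repeat=3):
        # Axiom 1 (associativity) for both operations.
        if add(add(a, b), c) != add(a, add(b, c)):
            return False
        if mul(mul(a, b), c) != mul(a, mul(b, c)):
            return False
        # Axiom 3 (distributivity).
        if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):
            return False
    for a, b in product(elems, repeat=2):
        # Axiom 2 (commutativity) for both operations.
        if add(a, b) != add(b, a) or mul(a, b) != mul(b, a):
            return False
    for a in elems:
        # Axiom 4 (identities 0 and 1).
        if add(a, 0) != a or mul(a, 1) != a:
            return False
        # Axiom 5 (additive inverse).
        if not any(add(a, b) == 0 for b in elems):
            return False
        # Axiom 6 (multiplicative inverse for nonzero elements).
        if a != 0 and not any(mul(a, c) == 1 for c in elems):
            return False
    return True
```

For example, `is_field_mod(5)` succeeds, whereas `is_field_mod(6)` fails at the last axiom because 2 has no multiplicative inverse modulo 6. The first five axioms hold in both cases, which is the situation described by a commutative ring.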






S. M. LaValle: Planning Algorithms 



Polynomials Suppose there are $n$ variables, $x_1, x_2, \ldots, x_n$. A monomial
over a field, $F$, is a product of the form

$x_1^{d_1} \cdot x_2^{d_2} \cdots x_n^{d_n}$, (4.52)

in which all of the exponents $d_1, d_2, \ldots, d_n$ are nonnegative integers.
The total degree of the monomial is $d_1 + \cdots + d_n$.

A polynomial $f$ in $x_1, \ldots, x_n$ with coefficients in $F$ is a finite
linear combination of monomials that have coefficients in $F$. A polynomial can
be expressed as

$f = \sum_{i=1}^{m} c_i m_i$, (4.53)

in which $m_i$ is a monomial as shown in (4.52) and $c_i \in F$ is a coefficient.
If $c_i \neq 0$, then each $c_i m_i$ is called a term. Note that the exponents,
$d_i$, may be different for every term of $f$. The total degree of $f$ is the
maximum total degree among the monomials of the terms of $f$. The set of all
polynomials in $x_1, \ldots, x_n$ with coefficients in $F$ is denoted by
$F[x_1, \ldots, x_n]$.

Example 4.4.1 The definitions correspond exactly to our intuitive notion of
a polynomial. For example, suppose $F = \mathbb{Q}$. An example of a polynomial
in $\mathbb{Q}[x_1, x_2, x_3]$ is

$x_1^4 - \frac{1}{2} x_1 x_2 x_3^3 + x_1^2 x_2^2 + 4$. (4.54)

Note that 1 is a valid monomial; hence, any element of $F$ may appear alone as a
term, such as the $4 \in \mathbb{Q}$ in the polynomial above. The total degree of
(4.54) is 5 due to the second term. An equivalent polynomial may be written using
nicer variables. Using $x$, $y$, and $z$ as variables yields

$x^4 - \frac{1}{2} x y z^3 + x^2 y^2 + 4$, (4.55)

which belongs to $\mathbb{Q}[x, y, z]$. ■



The set, $F[x_1, \ldots, x_n]$, of polynomials is actually a group with respect
to addition; however, it is not a field. Even though polynomials can be
multiplied, some polynomials do not have a multiplicative inverse. Therefore, the
set $F[x_1, \ldots, x_n]$ is often referred to as a commutative ring of
polynomials. A commutative ring is a set with two operations for which every
axiom for fields is satisfied except the last one, which requires a
multiplicative inverse.
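The definitions of monomial, term, and total degree translate directly into a small data structure. The sketch below is one possible representation, not something prescribed by the text: a polynomial is a dictionary that maps an exponent tuple to a rational coefficient. It uses the monomials of the example polynomial in three variables; the coefficient 1/2 on the second term is an assumption made for illustration, and all function names are hypothetical.

```python
from fractions import Fraction

# A polynomial in Q[x, y, z] stored as {exponent tuple: coefficient}.
# The key (d1, d2, d3) stands for the monomial x^d1 * y^d2 * z^d3.
poly = {
    (4, 0, 0): Fraction(1),
    (1, 1, 3): Fraction(-1, 2),  # coefficient assumed for illustration
    (2, 2, 0): Fraction(1),
    (0, 0, 0): Fraction(4),      # the monomial 1 with coefficient 4
}

def total_degree(p):
    """Maximum total degree among the terms (nonzero coefficients) of p."""
    return max(sum(exps) for exps, coeff in p.items() if coeff != 0)

def evaluate(p, point):
    """Substitute field elements for the variables, as in f : F^n -> F."""
    total = Fraction(0)
    for exps, coeff in p.items():
        term = coeff
        for value, d in zip(point, exps):
            term *= Fraction(value) ** d
        total += term
    return total

def multiply(p, q):
    """Product of two polynomials; the result is again a polynomial."""
    result = {}
    for exps_p, coeff_p in p.items():
        for exps_q, coeff_q in q.items():
            exps = tuple(dp + dq for dp, dq in zip(exps_p, exps_q))
            result[exps] = result.get(exps, Fraction(0)) + coeff_p * coeff_q
    return result
```

Here `total_degree(poly)` is 5 because of the key `(1, 1, 3)`. Sums and products of such dictionaries are again dictionaries of the same kind, but there is in general no dictionary `q` for which `multiply(poly, q)` equals the constant 1, which is why the set forms a commutative ring rather than a field.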

Varieties For a given field $F$ and positive integer $n$, the n-dimensional affine
space over $F$ is the set

$F^n = \{(c_1, \ldots, c_n) \mid c_1, \ldots, c_n \in F\}$. (4.56)

For our purposes in this section, an affine space can be considered as a vector
space (the exact definition appears in []). $F^n$ is like a vector version of the
scalar field $F$. Familiar examples of this are $\mathbb{Q}^n$, $\mathbb{R}^n$,
and $\mathbb{C}^n$.

A polynomial in $F[x_1, \ldots, x_n]$ can be converted into a function

$f : F^n \to F$, (4.57)

by substituting elements of $F$ for each variable, and evaluating the expression
using the field operations. This can be written as $f(a_1, \ldots, a_n) \in F$, in
which each $a_i$ denotes an element of $F$ that is substituted for the variable
$x_i$.

We now arrive at an interesting question. For a given $f$, what are the elements
of $F^n$ such that $f(x_1, \ldots, x_n) = 0$? We could also ask the question for
some nonzero element, but notice that this is not necessary because the
polynomial may be redefined to formulate the question with 0. For example, what
are the elements of $\mathbb{R}^2$ such that $x^2 + y^2 = 1$? This familiar
equation for $S^1$ can be reformulated to yield: what are the elements of
$\mathbb{R}^2$ such that $x^2 + y^2 - 1 = 0$?

Let $F$ be a field and let $f_1, \ldots, f_k$ be polynomials in
$F[x_1, \ldots, x_n]$. The set

$V(f_1, \ldots, f_k) = \{(a_1, \ldots, a_n) \in F^n \mid f_i(a_1, \ldots, a_n) = 0 \text{ for all } 1 \leq i \leq k\}$, (4.58)

is called the (affine) variety defined by $f_1, \ldots, f_k$. One interesting fact
is that unions and intersections of varieties are varieties. Therefore, they
behave like the semi-algebraic sets from Section 3.1.2, but notice that for
varieties only equations of the form $f = 0$ are allowed. Consider the varieties
$V(f_1, \ldots, f_k)$ and $V(g_1, \ldots, g_l)$. Their intersection is given by

$V(f_1, \ldots, f_k) \cap V(g_1, \ldots, g_l) = V(f_1, \ldots, f_k, g_1, \ldots, g_l)$, (4.59)

because each element of $F^n$ in the intersection must produce a 0 value for each
of the polynomials $f_1, \ldots, f_k, g_1, \ldots, g_l$.

To obtain unions, the polynomials simply need to be multiplied. For example,
consider the varieties $V_1, V_2 \subset F^n$ defined as

$V_1 = \{(a_1, \ldots, a_n) \in F^n \mid f_1(a_1, \ldots, a_n) = 0\}$ (4.60)

and

$V_2 = \{(a_1, \ldots, a_n) \in F^n \mid f_2(a_1, \ldots, a_n) = 0\}$. (4.61)

The set $V_1 \cup V_2 \subset F^n$ is obtained by forming the polynomial
$f = f_1 f_2$. Note that $f(a_1, \ldots, a_n) = 0$ if either
$f_1(a_1, \ldots, a_n) = 0$ or $f_2(a_1, \ldots, a_n) = 0$. Therefore,
$V_1 \cup V_2$ is a variety. The varieties $V_1$ and $V_2$ were defined using a
single polynomial, but the same idea applies to any variety. All pairs of the
form $f_i g_j$ must appear in the list of polynomials in $V(\cdot)$ if there are
multiple polynomials.
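The intersection and union rules can be confirmed by brute force when the field is finite, because every point of the affine space can then be enumerated. The sketch below assumes the integers modulo 7 as the field and uses three hypothetical polynomials chosen only for illustration; none of these choices comes from the text.

```python
from itertools import product

P = 7  # a small prime, so arithmetic modulo P is a field (demo assumption)

def variety(polys, n=2, p=P):
    """Return V(polys): all points of (Z_p)^n where every polynomial is 0 mod p."""
    return {pt for pt in product(range(p), repeat=n)
            if all(f(*pt) % p == 0 for f in polys)}

def f(x, y):
    return x * x + y * y - 1  # analogous to the circle equation

def g(x, y):
    return x - y              # a line through the origin

def h(x, y):
    return y                  # the horizontal axis

def fg(x, y):
    return f(x, y) * g(x, y)

def hg(x, y):
    return h(x, y) * g(x, y)

V_f = variety([f])
V_g = variety([g])

# Intersection: list the polynomials of both varieties together.
intersection_ok = variety([f, g]) == (V_f & V_g)
# Union of single-polynomial varieties: multiply the polynomials.
union_ok = variety([fg]) == (V_f | V_g)
# Union when one variety needs two polynomials: use all pairwise products.
union_pairs_ok = variety([fg, hg]) == (variety([f, h]) | V_g)
```

All three comparisons evaluate to `True`. The union rule depends on the field property that a product is zero only if one of its factors is zero, which would fail for arithmetic modulo a composite number.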
4.4.2 Kinematic Chains in $\mathbb{R}^2$

To illustrate the concepts it will be helpful to study a simple case in detail. Let
$W = \mathbb{R}^2$, and suppose there is a chain of links, $A_1, \ldots, A_n$, as
considered in Example 3.3.1 for $n = 3$. Suppose that the first link is attached
at the origin of $W$ by a revolute joint, and every other link, $A_i$, is attached
to $A_{i-1}$ by a revolute joint. This yields the configuration space

$C = S^1 \times S^1 \times \cdots \times S^1 = T^n$, (4.62)

which is the n-dimensional torus.



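Two links If there are two links, $A_1$ and $A_2$, then the configuration
space can be nicely visualized as a square with opposite sides identified. Each
coordinate, $\theta_i$, ranges from 0 to $2\pi$, for which $0 \sim 2\pi$. Suppose
that each link has length 1. This yields $a_1 = a_2 = 1$. A point,
$(x, y) \in A_2$, is transformed as

$\begin{pmatrix} \cos\theta_1 & -\sin\theta_1 & 0 \\ \sin\theta_1 & \cos\theta_1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \cos\theta_2 & -\sin\theta_2 & 1 \\ \sin\theta_2 & \cos\theta_2 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$. (4.63)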

To obtain polynomials, the technique from Section 4.2.2 is applied to replace
the trigonometric functions using $a_i = \cos\theta_i$ and $b_i = \sin\theta_i$,
subject to the constraint $a_i^2 + b_i^2 = 1$. This results in

$\begin{pmatrix} a_1 & -b_1 & 0 \\ b_1 & a_1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} a_2 & -b_2 & 1 \\ b_2 & a_2 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$, (4.64)

for which the constraints $a_i^2 + b_i^2 = 1$ for $i = 1, 2$ must be satisfied.
This preserves the torus topology of $C$, but now it is embedded in
$\mathbb{R}^4$. The coordinates of each point are
$(a_1, b_1, a_2, b_2) \in \mathbb{R}^4$; however, there are only two degrees of
freedom because each $a_i$, $b_i$ pair must lie on a unit circle.

Multiplying the matrices in (4.64) yields the polynomials,
$f_1, f_2 \in \mathbb{R}[a_1, b_1, a_2, b_2]$,

$f_1 = x a_1 a_2 - y a_1 b_2 - x b_1 b_2 - y a_2 b_1 + a_1$ (4.65)

and

$f_2 = y a_1 a_2 + x a_1 b_2 + x a_2 b_1 - y b_1 b_2 + b_1$, (4.66)

for the X and Y coordinates, respectively. Note that the polynomial variables
are configuration parameters, not $x$ and $y$. For a given point $(x, y)$ in
$A_2$, all coefficients are determined.

Now a kinematic closure constraint will be imposed. Fix the point (1, 0) in $A_2$
at (1, 1) in $W$. This yields the constraints

$f_1 = a_1 a_2 - b_1 b_2 + a_1 = 1$ (4.67)

and

$f_2 = a_1 b_2 + a_2 b_1 + b_1 = 1$, (4.68)

by substituting $x = 1$ and $y = 0$ into (4.65) and (4.66). This yields the variety

$V(a_1 a_2 - b_1 b_2 + a_1 - 1, \; a_1 b_2 + a_2 b_1 + b_1 - 1, \; a_1^2 + b_1^2 - 1, \; a_2^2 + b_2^2 - 1)$, (4.69)

which is a subset of $\mathbb{R}^4$. The polynomials are slightly modified because
each constraint must be written in the form $f = 0$.

Figure 4.22: There are two configurations that hold the point p at (1, 1).

Although (4.69) represents the constrained configuration space for the chain
of two links, it is not very explicit. Without an explicit characterization (e.g.,
a parameterization), motion planning becomes complicated. From Figure 4.22 it can
be seen that there are only two solutions. These occur for $\theta_1 = 0$,
$\theta_2 = \pi/2$, and $\theta_1 = \pi/2$, $\theta_2 = -\pi/2$. In terms of the
polynomial variables, $(a_1, b_1, a_2, b_2)$, the two solutions are (1, 0, 0, 1)
and (0, 1, 0, -1). These may be substituted into each polynomial in (4.69) to
verify that 0 is obtained. Thus, the variety represents two points in
$\mathbb{R}^4$. This can also be interpreted as two points on the torus
$S^1 \times S^1$.
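The substitution check suggested above is easy to carry out mechanically. The following sketch, with hypothetical function names, evaluates the four polynomials obtained from (4.67), (4.68), and the two unit-circle constraints at both solutions, and also confirms through the joint angles that the point (1, 0) of the second link lands at (1, 1).

```python
import math

def closure_polynomials(a1, b1, a2, b2):
    """The four polynomials of the variety; each must be 0 at a solution."""
    return (
        a1 * a2 - b1 * b2 + a1 - 1,   # X coordinate held at 1
        a1 * b2 + a2 * b1 + b1 - 1,   # Y coordinate held at 1
        a1 ** 2 + b1 ** 2 - 1,        # (a1, b1) lies on the unit circle
        a2 ** 2 + b2 ** 2 - 1,        # (a2, b2) lies on the unit circle
    )

def end_point(theta1, theta2):
    """Position in W of the point (1, 0) of A2 for two unit-length links."""
    x = math.cos(theta1) + math.cos(theta1 + theta2)
    y = math.sin(theta1) + math.sin(theta1 + theta2)
    return x, y

solutions = [(1, 0, 0, 1), (0, 1, 0, -1)]
residuals = [closure_polynomials(*s) for s in solutions]
```

Both entries of `residuals` are `(0, 0, 0, 0)`. A point such as (1, 0, 1, 0), which corresponds to a fully stretched arm, satisfies the two circle constraints but not the closure constraints.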

It might not be surprising that the set of solutions has dimension zero because
there are four independent constraints, shown in (4.69), and four variables.
Depending on the choices, the variety may be empty. For example, it is physically
impossible to bring the point (1, 0) in $A_2$ to (1000, 0) in $W$.

The most interesting and complicated situations occur when there is a continuum
of solutions. For example, if one of the constraints is removed, then a
one-dimensional set of solutions can be obtained. Suppose only one variable is
constrained for the example in Figure 4.22. Intuitively, this should yield a
one-dimensional variety. Set the X coordinate to 0, which yields

$a_1 a_2 - b_1 b_2 + a_1 = 0$, (4.70)

and allow any possible value for $y$. As shown in Figure 4.23.a, the point p must
follow the Y axis. (This is equivalent to a three-bar linkage that can be
constructed by making a third joint that is prismatic and forced to stay along
the Y axis.) Figure 4.23.b shows the resulting variety
$V(a_1 a_2 - b_1 b_2 + a_1)$, but plotted in $\theta_1$-$\theta_2$ coordinates to
reduce the dimension from 4 to 2 for visualization purposes. To correctly
interpret the figures in Figure 4.23, recall that the topology is
$S^1 \times S^1$, which means that the top and bottom are identified, and also the
sides are identified. The center of Figure 4.23.b, which corresponds to
$(\theta_1, \theta_2) = (\pi, \pi)$, prevents the variety from being a manifold.
The resulting space is actually homeomorphic to two circles that touch at a point.
Thus, even with such a simple example, the nice manifold structure may disappear.
Observe that at $(\pi, \pi)$ the links are completely overlapped, and the point p
of $A_2$ is placed at (0, 0) in $W$. The horizontal line in Figure 4.23.b
corresponds to keeping the two links overlapping, and swinging them around
together by varying $\theta_1$. The diagonal lines correspond to moving along
configurations such as the one shown in Figure 4.23.a. Note that the links and the
Y axis always form an isosceles triangle, which can be used to show that the
solution set is any pair of angles, $\theta_1$, $\theta_2$, for which
$\theta_2 = \pi - 2\theta_1$. This is the reason why the diagonal curves in
Figure 4.23.b are linear. Figures 4.23.c and 4.23.d show the varieties for the
constraints

$a_1 a_2 - b_1 b_2 + a_1 = \frac{1}{8}$, (4.71)

and

$a_1 a_2 - b_1 b_2 + a_1 = 1$, (4.72)

respectively. In these cases, the point (1, 0) in $A_2$ must follow the vertical
lines $x = 1/8$ and $x = 1$, respectively. The varieties are manifolds, which are
homeomorphic to $S^1$. The sequence from Figure 4.23.b to 4.23.d can be imagined
as part of an animation in which the solution shrinks into a small circle.
Eventually, it shrinks to a point for the case $a_1 a_2 - b_1 b_2 + a_1 = 2$,
because the only solution is when $\theta_1 = \theta_2 = 0$. Beyond this, the
variety is the empty set because there are no solutions. Thus, by allowing one
constraint to vary, four different topologies were obtained: 1) two circles joined
at a point, 2) a circle, 3) a point, and 4) the empty set.
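The shrinking of the solution set can be explored numerically. With $a_i = \cos\theta_i$ and $b_i = \sin\theta_i$, the angle-sum identity gives $a_1 a_2 - b_1 b_2 + a_1 = \cos\theta_1 + \cos(\theta_1 + \theta_2)$, so for a chosen constraint value c and a chosen $\theta_1$ the admissible $\theta_2$ values follow from an arccosine. The sketch below is only an illustration under that observation; the function names and the rounding used to merge coincident roots are assumptions of this example.

```python
import math

def constraint_value(theta1, theta2):
    """Left side of the single X-coordinate constraint, in polynomial form."""
    a1, b1 = math.cos(theta1), math.sin(theta1)
    a2, b2 = math.cos(theta2), math.sin(theta2)
    return a1 * a2 - b1 * b2 + a1

def theta2_solutions(theta1, c, tol=1e-12):
    """All theta2 in [0, 2*pi) for which constraint_value(theta1, theta2) = c."""
    r = c - math.cos(theta1)      # required value of cos(theta1 + theta2)
    if abs(r) > 1 + tol:
        return []                 # no configuration reaches this X value
    r = max(-1.0, min(1.0, r))
    phi = math.acos(r)
    # theta1 + theta2 = +phi or -phi (mod 2*pi); rounding merges equal roots.
    sols = {round((s * phi - theta1) % (2 * math.pi), 9) for s in (1, -1)}
    return sorted(sols)
```

For c = 0, one of the returned values is always $\theta_2 = \pi$, which is the overlapped configuration. For c = 2 the only $\theta_1$ that yields any solution is 0, and the solution is $\theta_2 = 0$. For c > 2 the list is empty for every $\theta_1$.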

Three links Since visualization is still possible with one more dimension,
suppose there are three links, $A_1$, $A_2$, and $A_3$. The configuration space
can be visualized as a 3D cube with opposite faces identified. Each coordinate,
$\theta_i$, ranges from 0 to $2\pi$, for which $0 \sim 2\pi$. Suppose that each
link has length 1 to obtain $a_1 = a_2 = 1$. A point, $(x, y) \in A_3$, is
transformed as

$\begin{pmatrix} \cos\theta_1 & -\sin\theta_1 & 0 \\ \sin\theta_1 & \cos\theta_1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \cos\theta_2 & -\sin\theta_2 & 1 \\ \sin\theta_2 & \cos\theta_2 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \cos\theta_3 & -\sin\theta_3 & 1 \\ \sin\theta_3 & \cos\theta_3 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$. (4.73)

Figure 4.23: A single constraint was added to the point p on $A_2$, as shown in (a).
The curves in (b), (c), and (d) depict the variety for the cases of $f_1 = 0$,
$f_1 = 1/8$, and $f_1 = 1$, respectively.
To obtain polynomials, let $a_i = \cos\theta_i$ and $b_i = \sin\theta_i$, to obtain

$\begin{pmatrix} a_1 & -b_1 & 0 \\ b_1 & a_1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} a_2 & -b_2 & 1 \\ b_2 & a_2 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} a_3 & -b_3 & 1 \\ b_3 & a_3 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$, (4.74)

for which the constraints $a_i^2 + b_i^2 = 1$ for $i = 1, 2, 3$ must be satisfied.
This preserves the torus topology of $C$, but now it is embedded in
$\mathbb{R}^6$. Multiplying the matrices for the point $(x, y) = (1, 0)$ yields
the polynomials $f_1, f_2 \in \mathbb{R}[a_1, b_1, a_2, b_2, a_3, b_3]$, defined as

$f_1 = a_1 a_2 a_3 - a_1 b_2 b_3 + a_1 a_2 - b_1 b_2 a_3 - b_1 a_2 b_3 - b_1 b_2 + a_1$, (4.75)

and

$f_2 = b_1 a_2 a_3 - b_1 b_2 b_3 + b_1 a_2 + a_1 b_2 a_3 + a_1 a_2 b_3 + a_1 b_2 + b_1$, (4.76)

for the X and Y coordinates, respectively.

Again, consider imposing a single constraint,

$a_1 a_2 a_3 - a_1 b_2 b_3 + a_1 a_2 - b_1 b_2 a_3 - b_1 a_2 b_3 - b_1 b_2 + a_1 = 0$, (4.77)

which constrains the point $(1, 0) \in A_3$ to traverse the Y axis. The resulting
variety is an interesting manifold, as depicted from three different viewpoints in
Figures 4.24 to 4.26 (remember that the sides of the cube are identified).

By increasing the X value for the constraint on the final point, the variety can
once again be forced to shrink. Snapshots for $f_1 = 7/8$ and $f_1 = 2$ are shown
in Figure 4.27. At $f_1 = 1$, the variety is not a manifold, but it then changes
to $S^2$. Eventually, this sphere is reduced to a point at $f_1 = 3$, and then for
$f_1 > 3$ the variety is empty.

Instead of the constraint $f_1 = 0$, we could constrain the Y coordinate of p to
obtain $f_2 = 0$. This yields another two-dimensional variety. If both
constraints are enforced simultaneously, then the result is the intersection of
the two original varieties. For example, suppose $f_1 = 1$ and $f_2 = 0$. This is
equivalent to a kind of four-bar mechanism [], in which the fourth link, $A_4$, is
fixed along the X axis from 0 to 1. The resulting variety,

$V(a_1 a_2 a_3 - a_1 b_2 b_3 + a_1 a_2 - b_1 b_2 a_3 - b_1 a_2 b_3 - b_1 b_2 + a_1 - 1, \; b_1 a_2 a_3 - b_1 b_2 b_3 + b_1 a_2 + a_1 b_2 a_3 + a_1 a_2 b_3 + a_1 b_2 + b_1)$, (4.78)

is depicted in Figure 4.28. Using $\theta_1$, $\theta_2$, $\theta_3$ coordinates,
the solution may be easily parameterized as a collection of line segments. For all
$t \in [0, \pi]$, there exist solution points at $(0, 2t, \pi)$,
$(t, 2\pi - t, \pi + t)$, $(2\pi - t, t, \pi - t)$, $(2\pi - t, \pi, \pi + t)$,
and $(t, \pi, \pi - t)$. Note that once again, the variety is not a manifold. A
family of interesting varieties can be generated for the four-bar mechanism by
selecting different lengths for the links. The topologies of these mechanisms have
been determined for both 2D [] and a 3D extension that uses spherical joints [553].
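The five families of solution points can be checked numerically by pushing the point (1, 0) of the third link through the three homogeneous transformations of (4.73) and confirming that it lands at (1, 0) in W, which is the pair of constraints $f_1 = 1$ and $f_2 = 0$. The sketch below works directly with the matrices rather than with the expanded polynomials; the function names and the sampling density are assumptions of this illustration.

```python
import math

def link_transform(theta, offset):
    """Homogeneous 2D transform: rotation by theta, translation offset along x."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, offset],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def apply(T, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(T[i][j] * v[j] for j in range(3)) for i in range(3)]

def end_point(t1, t2, t3):
    """Position in W of the point (1, 0) of A3 for three unit-length links."""
    v = [1.0, 0.0, 1.0]
    # Apply the innermost transform first: T3, then T2, then T1.
    for theta, offset in ((t3, 1.0), (t2, 1.0), (t1, 0.0)):
        v = apply(link_transform(theta, offset), v)
    return v[0], v[1]

# The five solution families listed in the text, parameterized by t in [0, pi].
FAMILIES = [
    lambda t: (0.0, 2 * t, math.pi),
    lambda t: (t, 2 * math.pi - t, math.pi + t),
    lambda t: (2 * math.pi - t, t, math.pi - t),
    lambda t: (2 * math.pi - t, math.pi, math.pi + t),
    lambda t: (t, math.pi, math.pi - t),
]

def max_residual(samples=50):
    """Largest deviation from (X, Y) = (1, 0) over sampled points of all families."""
    worst = 0.0
    for k in range(samples + 1):
        t = math.pi * k / samples
        for family in FAMILIES:
            x, y = end_point(*family(t))
            worst = max(worst, abs(x - 1.0), abs(y))
    return worst
```

The value returned by `max_residual()` is at the level of floating-point round-off, which is consistent with every sampled point lying on the variety. A numerical check of this kind does not replace the argument requested in the exercises, but it is a quick way to catch a mistyped family.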
Figure 4.24: The two-dimensional variety for the three-link chain with $f_1 = 0$.

4.4.3 Defining the Variety for General Problems 

Now a general methodology for defining the variety will be described. Keeping 
the previous examples in mind will help in understanding the formulation. In the 
general case, each constraint can be thought of as a statement of the form: 

The $i$th coordinate of a point $p \in A_j$ needs to be held at the value $x$ in
the coordinate frame of $A_k$.

For the variety in Figure 4.23.b, the first coordinate of a point $p \in A_2$ was
held at the value 0 in $W$ (which is the same frame as for $A_1$). The general
form must also allow a point to be fixed with respect to the frame of links other
than $A_1$, which did not occur in Section 4.4.2.

Suppose that $n$ links, $A_1, \ldots, A_n$, move in $W = \mathbb{R}^2$ or
$W = \mathbb{R}^3$. One link, $A_1$ for convenience, is designated as the root, as
defined in Section 3.4. Suppose there is a finite set of joints, in which each
joint is represented as a pair $(i, j)$, which indicates that $A_i$ is attached to
$A_j$ by a joint. It is assumed that $i \neq j$.

A linkage graph, G(V, E), is constructed from the links and joints. Each vertex 
of G represents a link in L. Each edge in G represents a joint. This definition 
may seem somewhat backwards, especially in the plane because links often look 
like edges and joints look like vertices. This assignment is also possible, but is
not easy to generalize to the case of a single link that has more than two joints.
If more than two links are attached at the same point, each will generate an edge
in our representation.

Figure 4.25: Another view of the variety in Figure 4.24.

The steps to determine the polynomial constraints that express the variety are:

1. Define the linkage graph, G, with one vertex per link and one edge per joint.
If a joint connects more than two bodies, then one body must be designated
as a junction. See Figures 4.29 and 4.30.a. In Figure 4.30, links 4, 13, and
23 were designated as junctions in this way.

2. Designate one link as the root, $A_1$. This link may either be fixed in $W$, or
transformations may be applied. In the latter case, the set of transformations
could be SE(2) or SE(3), depending on the dimension of $W$. This enables the
entire linkage to move independently of its internal motions.

3. Eliminate the loops by constructing a spanning tree, T, of the linkage graph,
G. This implies that every vertex (or link) is reachable by a path from the
root. Any spanning tree may be used. Figure 4.30.b shows a resulting
spanning tree after deleting the edges shown with dashed lines.

Figure 4.26: A third view of the variety in Figure 4.24.

4. Apply the techniques of Section 3.4 to assign frames and transformations to 
the resulting tree of links. 

5. For each edge of G that does not appear in T, write a set of constraints 
between the two corresponding links. In Figure 4.30.b, it can be seen that 
constraints are needed between four pairs of links: 14-15, 21-22, 23-24, and 
19-23. 

This is perhaps the trickiest part. For examples like the one shown in Figure 
3.28, the constraint may be formulated as in (3.73). This is equivalent to 
what was done to obtain the example in Figure 4.28, which means that 
there are actually two constraints, one for each of the X and Y coordinates. 
This will also work for the example shown in Figure 4.29 if all joints are 
revolute. Suppose instead that two bodies, $A_j$ and $A_k$, must be rigidly
attached. This would require adding one more constraint that prevents
mutual rotation. This could be achieved by selecting another point on $A_j$
and ensuring that one of its coordinates is in the correct position in the
frame of $A_k$. If four equations are added, two from each point, then one
of them will be redundant because there are only three degrees of freedom
possible for $A_j$ relative to $A_k$ (which comes from the dimension of SE(2)).

A similar, but more complicated, situation occurs for $W = \mathbb{R}^3$. Holding
a single point fixed produces three constraints. If a single point is held fixed,
then $A_j$ may achieve any rotation in SO(3) with respect to $A_k$. This implies
that $A_j$ and $A_k$ are attached by a spherical joint. If they are attached by a
revolute joint, then two more constraints are needed, which can be chosen
from the coordinates of a second point. If $A_j$ and $A_k$ are rigidly attached,
then one constraint from a third point will be needed. In total, however,
there can be no more than six independent constraints because this is the
dimension of SE(3).

Figure 4.27 (panels: $f_1 = 7/8$ and $f_1 = 2$): If $f_1 > 0$ then the variety
shrinks. If $1 < f_1 < 3$, the variety is a sphere. At $f_1 = 3$ it is a point,
and for $f_1 > 3$ it completely vanishes.

6. Convert the trigonometric functions to polynomials. For any 2D
transformations, the familiar substitution of complex numbers may be made. If the
DH parameterization is used for the 3D case, then each of the $\cos\theta_i$,
$\sin\theta_i$ terms can be parameterized with one complex number, and each of the
$\cos\alpha_i$, $\sin\alpha_i$ terms can be parameterized with another. If the
rotation matrix for SO(3) is directly used in the parameterization, then the
quaternion parameterization should be used. In all of these cases, polynomial
transformations will result.

7. List the constraints as polynomials of the form / = 0. To write the descrip- 
tion of the variety, all of the polynomials must be set equal to zero, as was 
done for the examples in Section 4.4.2. 
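Steps 1, 3, and 5 above are purely graph operations and can be sketched in a few lines. The example below assumes that links are numbered from 1, that joints are given as pairs of link numbers, and that no two links share more than one joint; it builds a spanning tree by breadth-first search from the root and returns the joints that were left out, which are the ones requiring closure constraints. The function name and the small four-link example are hypothetical and are not the linkage of Figure 4.29.

```python
from collections import deque

def split_joints(num_links, joints, root=1):
    """Return (tree_edges, loop_edges) for a linkage graph.

    tree_edges form a spanning tree rooted at `root`; loop_edges are the
    remaining joints, each of which needs closure constraints (step 5).
    """
    adjacency = {v: [] for v in range(1, num_links + 1)}
    for i, j in joints:
        adjacency[i].append(j)
        adjacency[j].append(i)

    parent = {root: None}
    tree_edges = []
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in parent:          # first time this link is reached
                parent[v] = u
                tree_edges.append((u, v))
                queue.append(v)

    if len(parent) != num_links:
        raise ValueError("linkage graph is not connected")

    in_tree = {frozenset(edge) for edge in tree_edges}
    loop_edges = [edge for edge in joints if frozenset(edge) not in in_tree]
    return tree_edges, loop_edges

# A single closed loop of four links joined in a cycle.
tree, loops = split_joints(4, [(1, 2), (2, 3), (3, 4), (4, 1)])
```

For this cycle the tree keeps three joints and `loops` contains the single joint `(3, 4)`. Breadth-first search is only one choice; as stated in step 3, any spanning tree may be used, and a different tree simply moves the closure constraints to a different joint.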

Is it possible to determine the dimension of the variety from the number of
independent constraints? The answer is generally NO, which can be easily seen from
the chains of links in Section 4.4.2, which produced varieties of various
dimensions, depending on the particular equations. Techniques for computing the
dimension exist but require much more machinery than is presented here (see the
literature section at the end of the chapter). However, there is a way to provide
a simple upper bound on the number of degrees of freedom. Suppose the total
degrees of freedom of the linkage in spanning tree form is $m$. Each independent
constraint can remove at most one degree of freedom. Thus, if there are $l$
independent constraints, then the variety can have no more than $m - l$
dimensions.

One final concern is the obstacle region, $C_{obs}$. Once the variety has been
identified, then the obstacle region and motion planning definitions in (4.40) and
Formulation 4.3.1 do not need to be changed, with the understanding that $C$
represents the linkages that maintain loops while moving.

Figure 4.28: If two constraints, $f_1 = 1$ and $f_2 = 0$, are imposed, then the
varieties are intersected to obtain a one-dimensional set of solutions. The
example is equivalent to a well-studied four-bar mechanism.

Figure 4.30: a) One way to make the linkage graph that corresponds to the linkage
in Figure 4.29. b) A spanning tree is indicated by showing the removed edges with
dashed lines.
Books on basic topology are [23, 328]. An excellent introduction to algebraic
topology is the book by Allen Hatcher [317], which is available online at:

http://www.math.cornell.edu/~hatcher/AT/ATpage.html

This is a graduate-level mathematics textbook. For an undergraduate-level
topology book that covers homology and contains many interesting examples and
illustrations, see [399].

Much of the presentation in Section 4.4 was inspired by the nice
undergraduate-level introduction to algebraic varieties in [178]. Examples of
simple robot arms that form closed chains are also included. In the context of
motion planning, an excellent source on closed chains is the recent thesis of
Juan Cortes [177].

C-space for points moving on a graph[2]. 

Mention better theoretical algorithms for computing C-space obstacles. 
Computing the dimension of algebraic varieties, etc. [178]. 

Exercises 

1. Consider the set X = {1, 2, 3, 4, 5}. Let X, $\emptyset$, {1, 3}, {1, 2}, {2, 3}, {1}, {2},
and {3} be the collection of all subsets of X that are designated as open 
sets. Is X a topological space? Is it a topological space if {1, 2, 3} is added 
to the collection of open sets? Explain. What are the closed sets (assuming 
{1, 2, 3} is included as an open set)? Are any subsets of X neither open nor 
closed? 

2. For the letters of the Russian alphabet A, E, B, F, E, E, >K, 3, H, H, K, 
JI, M, H, O, n, P, II, T, y, II, H, in, III, ?, ?, ?, IO, H determine 
which pairs are homeomorphic. 

3. Prove that homeomorphism yields an equivalence relation on the category of
all topological spaces.

4. What is the dimension of the configuration space for a cylindrical rod that
can translate and rotate in $\mathbb{R}^3$? If the rod is rotated about its
central axis, it is assumed that the rod's position and orientation are not
changed in any detectable way. Express the configuration space of the rod in
terms of a Cartesian product of simpler spaces (such as $S^1$, $S^2$,
$\mathbb{R}^n$, $P^2$, etc.). What is your reasoning?

5. Let $\tau_1 : [0, 1] \to \mathbb{R}^2$ be a loop path in the plane, defined as
follows: $\tau_1(s) = (\cos(2\pi s), \sin(2\pi s))$. This path traverses a unit
circle. Let $\tau_2 : [0, 1] \to \mathbb{R}^2$ be another loop path, defined as
follows: $\tau_2(s) = (-2 + 3\cos(2\pi s), \frac{1}{2}\sin(2\pi s))$. This path
traverses an ellipse that is centered at $(-2, 0)$. Show that $\tau_1$ and
$\tau_2$ are homotopic (by constructing a continuous function with an additional
parameter that "morphs" $\tau_1$ into $\tau_2$).

6. Prove that homotopy yields an equivalence relation on the set of all paths
from some $x_1 \in X$ to some $x_2 \in X$, in which $x_1$ and $x_2$ may be chosen
arbitrarily.

7. Determine the configuration space for a spacecraft in an asteroids game.

8. Determine the equations for Type B constraints.

9. Determine the configuration space for a car that drives around on a huge
sphere (such as the earth with no mountains or oceans). Assume the sphere
is big enough so that its curvature may be neglected (e.g., the car sits flatly
on the earth without wobbling). [Hint: it is not $S^2 \times S^1$.]

10. Show that (4.26) is a valid rotation matrix for all unit quaternions. 

11. Show that $F[x_1, \ldots, x_n]$, the set of polynomials over a field $F$ with
variables $x_1, \ldots, x_n$, is a group with respect to addition.

12. a) Define a unit quaternion, $h_1$, that expresses a rotation of $-\pi/2$
(-90 degrees) around the axis given by the vector
$[\frac{1}{\sqrt{3}} \; \frac{1}{\sqrt{3}} \; \frac{1}{\sqrt{3}}]$.

b) Define a unit quaternion, $h_2$, that expresses a rotation of $\pi$ around the
axis given by the vector [0 1 0].

c) Suppose the rotation represented by $h_1$ is performed, followed by the
rotation represented by $h_2$. This combination of rotations can be represented
as a single rotation around an axis given by a vector. Find this axis and the
angle of rotation about this axis. Please convert the trig functions whenever
possible (for example, $\sin\frac{\pi}{6} = \frac{1}{2}$,
$\sin\frac{\pi}{4} = \frac{1}{\sqrt{2}}$, and
$\sin\frac{\pi}{3} = \frac{\sqrt{3}}{2}$).

13. Suppose there are five polyhedral bodies that can float freely in a 3D world. 
They are each capable of rotating and translating. If these are treated as 
"one" composite robot, what would be the topology of the resulting config- 
uration space (assume that the bodies are NOT attached to each other)? 
What is its dimension? 

14. build the configuration space for containment 

15. The figure below shows the Mobius band defined by identification of sides 
of the unit square. Imagine that scissors are used to cut the band along the 
two dashed lines. Describe the resulting topological space. Is it a manifold? 
16. Consider the set of points in $\mathbb{R}^2$ that remain after a closed disk
of radius $\frac{1}{4}$ with center $(x, y)$ is removed for every value of
$(x, y)$ such that $x$ and $y$ are both integers. Is this a manifold? Explain.




17. Show that the solution curves shown in Figure 4.28 correctly illustrate the 
variety given in (4.78). 
Chapter 5

Sampling-Based Motion Planning

Chapter Status

What does this mean? Check

http://msl.cs.uiuc.edu/planning/status.html

for information on the latest version.



There are two main philosophies for addressing the motion planning problem,
Formulation 4.3.1 from Section 4.3.1. This chapter presents sampling-based
motion planning, which is outlined in Figure 5.1. The main idea is to avoid the
explicit construction of $C_{obs}$, as shown in Section 4.3, and instead conduct a
search that probes the C-space with a sampling scheme. This probing is enabled by
a collision detection module, which the motion planning algorithm considers as a
"black box." This enables the development of planning algorithms that are
independent of the particular geometric models. The collision detection module
handles concerns such as whether the models are semi-algebraic, 3D triangles,
nonconvex polyhedra, etc. This general philosophy has been very successful in
recent years for solving problems from industrial and biological applications that
involve thousands and even millions of geometric primitives. Such problems would
be practically impossible to solve using explicit $C_{obs}$ construction
techniques.

Section 5.1 presents metric and measure space concepts, which are fundamen- 
tal to nearly all sampling-based planning algorithms. Section 5.2 presents general 
sampling concepts and quality criteria that are effective for analyzing the perfor- 
mance of sampling-based algorithms. Section 5.3 gives a brief overview of collision 
detection algorithms, to gain an understanding of the information available to a 
planning algorithm, and the computation price that must be paid to obtain it. 
Section 5.4 presents a framework that defines algorithms which solving motion 
planning problems by integrating sampling and discrete planning (i.e., searching) 
techniques. These approaches can be considered single query in the sense that a 
single initial and goal are given, and the algorithm must search until it finds a 




171 



172 



S. M. LaValle: Planning Algorithms 



Motion Planning Algorithm 



Discrete | TC-Space 
Planning Sampling 



Figure 5.1: The sampling-based planning philosophy uses collision detection as 
a "black box" that separates the motion planning from the particular geometric 
and kinematic models. C-space sampling and discrete planning (i.e., searching) 
are performed. 

solution (or it may report early failure). Section 5.5 focuses on Rapidly-exploring 
Random Trees (RRTs) and Rapidly-exploring Dense Trees, which are used to develop 
efficient single-query planning algorithms. Section 5.6 covers multiple-query 
algorithms, which invest substantial preprocessing effort to build a data structure 
that is later used to obtain efficient solutions for many initial-goal pairs. In this 
case, it is assumed that the obstacles, $\mathcal{O}$, remain the same for every query. 

5.1 Distance and Volume in C-Space 

Virtually all sampling-based planning algorithms require a function that measures 
distance between two points in C. In most cases, this results in a metric space, 
which is introduced in Section 5.1.1. Useful examples for motion planning are 
given in Section 5.1.2. It will also be important to many of these algorithms to 
have a notion of the volume of a subset of C. This will result in a measure space, 
which is introduced in Section 5.1.3. Section 5.1.4 introduces invariant measures, 
which should be used whenever possible. 

5.1.1 Metric Spaces 

We are all familiar with the notion of Euclidean distance in $\mathbb{R}^n$. To define a 
distance function over $\mathcal{C}$, it will have to satisfy certain axioms so that it coincides 
with our expectations about distances based on Euclidean distance. 

The following definition and axioms are used to create a function that converts 
a topological space into a metric space.^1 A metric space, $(X, \rho)$, is a topological 
space, $X$, equipped with a function, $\rho : X \times X \rightarrow \mathbb{R}$, such that for any $a, b, c \in X$: 

1. (Non-negativity) $\rho(a, b) \geq 0$. 

2. (Reflexivity) $\rho(a, b) = 0$ if and only if $a = b$. 

3. (Symmetry) $\rho(a, b) = \rho(b, a)$. 

4. (Triangle inequality) $\rho(a, b) + \rho(b, c) \geq \rho(a, c)$. 

^1 Some topological spaces are not metrizable, which means that no function exists that satisfies 
the axioms. There are many metrization theorems that give sufficient conditions for a topological 
space to be metrizable [328], and virtually any space that arises in motion planning will be 
metrizable. 

The function $\rho$ defines distances between points in the metric space, and each 
of the four conditions on $\rho$ agrees with our intuitions about distance. The final 
condition implies that $\rho$ is optimal in the sense that the distance from $a$ to $c$ will 
always be less than or equal to the total distance obtained by traveling through 
an intermediate point, $b$, on the way from $a$ to $c$. 



$L_p$ metrics The most important family of metrics over $\mathbb{R}^n$ is given for any $p \geq 1$ as 

$$\rho(x, x') = \left( \sum_{i=1}^{n} |x_i - x'_i|^p \right)^{1/p}. \tag{5.1}$$



For each value of $p$, (5.1) is called an $L_p$ metric (pronounced "el pee"). The three 
most common cases are: 

$L_2$: The Euclidean metric, which is the familiar Euclidean distance in $\mathbb{R}^n$. 

$L_1$: The Manhattan metric, which is often nicknamed this way because in $\mathbb{R}^2$ 
it corresponds to the length of a path that is obtained by moving along 
an axis-aligned grid. For example, the distance from (0,0) to (2,5) is 7 by 
traveling "east two blocks" and then "north five blocks". 

$L_\infty$: The $L_\infty$ metric must actually be defined by taking the limit of (5.1) as $p$ 
tends to infinity. The result is 

$$\rho(x, x') = \max_{1 \leq i \leq n} |x_i - x'_i|, \tag{5.2}$$

which seems correct because the larger the value of $p$, the more the largest 
term of the sum in (5.1) dominates. 
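
The three cases above can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function names `lp_metric` and `linf_metric` are illustrative assumptions. It evaluates (5.1) and (5.2) for the pair of points used in the Manhattan example and shows the $L_p$ value approaching the $L_\infty$ value as $p$ grows.

```python
import math


def lp_metric(x, y, p):
    """L_p distance (5.1) between equal-length sequences x and y, for p >= 1."""
    if p < 1:
        raise ValueError("p must be at least 1 for (5.1) to define a metric")
    if len(x) != len(y):
        raise ValueError("x and y must have the same dimension")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def linf_metric(x, y):
    """L_infinity distance (5.2): the largest coordinate difference."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same dimension")
    return max(abs(a - b) for a, b in zip(x, y))


if __name__ == "__main__":
    a, b = (0.0, 0.0), (2.0, 5.0)
    print("L1  :", lp_metric(a, b, 1))      # Manhattan distance
    print("L2  :", lp_metric(a, b, 2))      # Euclidean distance, sqrt(29)
    for p in (4, 16, 64):
        print("L%-3d:" % p, lp_metric(a, b, p))
    print("Linf:", linf_metric(a, b))       # limit of the values above
```

For large $p$ the power $|x_i - x'_i|^p$ can overflow in floating point, so in practice (5.2) should be evaluated directly rather than as a limit.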

An $L_p$ metric can be derived from a norm on a vector space. An $L_p$ norm over 
$\mathbb{R}^n$ is defined as 

$$\|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}. \tag{5.3}$$

The case of $p = 2$ is the familiar definition of the magnitude of a vector, which is 
called the Euclidean norm. For example, assume the vector space is $\mathbb{R}^n$ and let 
$\| \cdot \|$ be the standard Euclidean norm. The $L_2$ metric is $\rho(x, y) = \|x - y\|$. Any 
$L_p$ metric can be written in terms of a vector subtraction, which is notationally 
convenient. 

Metric subspaces By verifying the axioms, it can be shown that any subspace, 
$Y$, of a metric space, $(X, \rho)$, itself becomes a metric space by restricting the 
domain of $\rho$ to $Y \times Y$. This conveniently provides metrics on any of the manifolds 
and varieties from Chapter 4 by simply using any $L_p$ metric on $\mathbb{R}^m$, the space in 
which the manifold or variety is embedded. 

Cartesian products of metric spaces Metrics extend nicely across Carte- 
sian products, which is very convenient because configuration spaces are often 
constructed from Cartesian products, especially in the case of multiple bodies. 
Let $(X, \rho_x)$ and $(Y, \rho_y)$ be two metric spaces. A metric space, $(Z, \rho_z)$, can be 
constructed for the Cartesian product $Z = X \times Y$ by defining the metric $\rho_z$ as 

$$\rho_z(z_1, z_2) = \rho_z(x_1, y_1, x_2, y_2) = c_1 \rho_x(x_1, x_2) + c_2 \rho_y(y_1, y_2), \tag{5.4}$$

in which $c_1 > 0$ and $c_2 > 0$ are any positive, real constants, and $x_1, x_2 \in X$ and 
$y_1, y_2 \in Y$. Other combinations lead to a metric for $Z$; for example, 

$$\rho_z(z_1, z_2) = \left( c_1 [\rho_x(x_1, x_2)]^p + c_2 [\rho_y(y_1, y_2)]^p \right)^{1/p}, \tag{5.5}$$

for any positive integer $p$. In either of these cases, two positive constants must be 
chosen. It is important to understand that many choices are possible, and there 
may not necessarily be a "correct" one. 

5.1.2 Important Metric Spaces for Motion Planning 

Example 5.1.1 (SO(2) metric using complex numbers) If $SO(2)$ is repre- 
sented by unit complex numbers, recall that this leads to a subset of $\mathbb{R}^2$ given by 
$\{(a, b) \in \mathbb{R}^2 \mid a^2 + b^2 = 1\}$. Therefore, any $L_p$ metric from $\mathbb{R}^2$ may be used. Using 
the Euclidean metric, 

$$\rho(a_1, b_1, a_2, b_2) = \sqrt{(a_1 - a_2)^2 + (b_1 - b_2)^2}, \tag{5.6}$$

for any pair of points $(a_1, b_1)$ and $(a_2, b_2)$. ■ 

Example 5.1.2 (SO(2) metric by comparing angles) You might have noticed 
that the previous metric for $SO(2)$ does not give the distance traveling along the 
circle. It instead takes a shortcut by computing the length of the line segment 
that connects the two points. This distortion may be undesirable. An alternative 
metric is obtained by directly comparing angles, $\theta_1$ and $\theta_2$. However, in this case 
special care has to be given to the identification, since there are two ways to reach 
$\theta_2$ from $\theta_1$ by traveling along the circle. This causes a min to appear in the metric 
definition: 

$$\rho(\theta_1, \theta_2) = \min\left( |\theta_1 - \theta_2|, \; 2\pi - |\theta_1 - \theta_2| \right), \tag{5.7}$$

for which $\theta_1, \theta_2 \in [0, 2\pi]/\sim$. This may alternatively be expressed using the 
complex number representation $a + bi$ as an angle between two vectors: 

$$\rho(a_1, b_1, a_2, b_2) = \cos^{-1}(a_1 a_2 + b_1 b_2), \tag{5.8}$$

for two points $(a_1, b_1)$ and $(a_2, b_2)$. ■ 
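
To make the difference between the chord metric (5.6) and the angular metric (5.7) concrete, here is a small Python sketch. It is an illustration only; the function names are assumptions, and the modulo operation in `so2_angle_metric` is an added convenience so that angles outside $[0, 2\pi)$ are also accepted. The last function evaluates (5.8), which should agree with (5.7).

```python
import math

TWO_PI = 2.0 * math.pi


def so2_chord_metric(theta1, theta2):
    """Euclidean (chord) distance (5.6) between the unit complex numbers."""
    a1, b1 = math.cos(theta1), math.sin(theta1)
    a2, b2 = math.cos(theta2), math.sin(theta2)
    return math.hypot(a1 - a2, b1 - b2)


def so2_angle_metric(theta1, theta2):
    """Distance along the circle (5.7); the min handles the identification."""
    d = abs(theta1 - theta2) % TWO_PI
    return min(d, TWO_PI - d)


def so2_arccos_metric(theta1, theta2):
    """Same distance via (5.8): the angle between the two unit vectors."""
    dot = (math.cos(theta1) * math.cos(theta2)
           + math.sin(theta1) * math.sin(theta2))
    # Clamp to [-1, 1] to guard against round-off before calling acos.
    return math.acos(max(-1.0, min(1.0, dot)))


if __name__ == "__main__":
    # Two angles on either side of the identified point 0 ~ 2*pi.
    t1, t2 = 0.1, TWO_PI - 0.1
    print("chord :", so2_chord_metric(t1, t2))
    print("angle :", so2_angle_metric(t1, t2))
    print("arccos:", so2_arccos_metric(t1, t2))
    # Opposite points: the chord cuts through the disc, the arc goes around.
    print("chord, opposite:", so2_chord_metric(0.0, math.pi))
    print("angle, opposite:", so2_angle_metric(0.0, math.pi))
```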



Example 5.1.3 (An SE(2) metric) Again by using the subspace principle, a 
metric can easily be obtained for $SE(2)$. Using the complex number representa- 
tion of $SO(2)$, each element of $SE(2)$ is a point $(x_t, y_t, a, b) \in \mathbb{R}^4$. The Euclidean 
metric, or any other $L_p$ metric on $\mathbb{R}^4$, can be immediately applied to obtain a 
metric. ■ 



Example 5.1.4 (SO(3) metrics using quaternions) As usual, the situation 
becomes more complicated for $SO(3)$. The unit quaternions form a subset, $\mathbb{S}^3$, of 
$\mathbb{R}^4$. Therefore, any $L_p$ metric may be used to define a metric on $\mathbb{S}^3$, but this will not 
be a metric for $SO(3)$ because antipodal points need to be identified. This leads 
to a min in the metric. Let $h_1, h_2 \in \mathbb{R}^4$ represent two unit quaternions (which 
are being interpreted here as elements of $\mathbb{R}^4$ by ignoring the quaternion algebra). 
The resulting metric is 

$$\rho(h_1, h_2) = \min\left( \|h_1 - h_2\|, \; \|h_1 + h_2\| \right), \tag{5.9}$$

in which the two arguments of the min correspond to the distances from $h_1$ to 
$h_2$ and $-h_2$, respectively. The $h_1 + h_2$ appears because $h_2$ was negated to yield 
its antipodal point, $-h_2$. 

Just as in the case of $SO(2)$, the metric in (5.9) may seem distorted because 
it measures the length of line segments that cut through the interior of $\mathbb{S}^3$, as 
opposed to traveling along the surface. This problem can be fixed to give a very 
natural metric for $SO(3)$, which is based on spherical linear interpolation. This 
takes the line segment that connects the points and pushes it outward onto $\mathbb{S}^3$. It 
is easier to visualize by dropping a dimension. Imagine computing the distance 
between two points on $\mathbb{S}^2$. If these points lie on the equator, then spherical linear 
interpolation yields a distance proportional to that obtained by traveling along 
the equator, as opposed to cutting through the interior of $\mathbb{S}^2$ (for points not on 
the equator, use the great circle through the points). 

It turns out that this metric can easily be defined in terms of the inner product 
between the two quaternions. Recall that for unit vectors, $v_1$ and $v_2$ in $\mathbb{R}^n$, 
$v_1 \cdot v_2 = \cos\theta$, in which $\theta$ is the angle between the vectors. This angle is precisely 
what is needed to give the proper distance along $\mathbb{S}^3$. The resulting metric is a 
surprisingly simple extension of (5.8): 

$$\rho(h_1, h_2) = \cos^{-1}(a_1 a_2 + b_1 b_2 + c_1 c_2 + d_1 d_2), \tag{5.10}$$

in which each $h_i = (a_i, b_i, c_i, d_i)$. ■ 
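
The two quaternion metrics of this example can be sketched in Python as follows. This is a hedged illustration, not the author's implementation: the function names are assumptions, quaternions are plain 4-tuples $(a, b, c, d)$, and unit length is assumed rather than checked. Note that (5.10), as written, measures distance on $\mathbb{S}^3$; the third function is an added assumption that applies the antipodal identification used in (5.9) by taking the absolute value of the inner product.

```python
import math


def quat_chord_metric(h1, h2):
    """Metric (5.9): Euclidean distance in R^4, minimized over h2 and -h2."""
    minus = math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))
    plus = math.sqrt(sum((a + b) ** 2 for a, b in zip(h1, h2)))
    return min(minus, plus)


def quat_angle_metric(h1, h2):
    """Metric (5.10): angle between h1 and h2 as unit vectors in R^4."""
    dot = sum(a * b for a, b in zip(h1, h2))
    # Clamp to [-1, 1] to guard against round-off before calling acos.
    return math.acos(max(-1.0, min(1.0, dot)))


def quat_angle_metric_identified(h1, h2):
    """Assumption beyond (5.10): treat h and -h as the same rotation."""
    dot = abs(sum(a * b for a, b in zip(h1, h2)))
    return math.acos(min(1.0, dot))


if __name__ == "__main__":
    identity = (1.0, 0.0, 0.0, 0.0)
    # Rotation by 90 degrees about the z-axis: (cos(pi/4), 0, 0, sin(pi/4)).
    quarter_turn = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
    negated = tuple(-c for c in identity)   # same rotation as identity

    print("chord (5.9)      :", quat_chord_metric(identity, quarter_turn))
    print("angle (5.10)     :", quat_angle_metric(identity, quarter_turn))
    print("h vs -h, (5.9)   :", quat_chord_metric(identity, negated))
    print("h vs -h, (5.10)  :", quat_angle_metric(identity, negated))
    print("h vs -h, identif.:", quat_angle_metric_identified(identity, negated))
```

Comparing the last three printed values shows why the identification matters: without it, a quaternion and its negation appear to be as far apart as possible even though they represent the same rotation.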



Example 5.1.5 (Another SE(2) metric) A metric defined on $SE(2)$ must com- 
pare both distance in the plane and an angular quantity. For example, even if 
$c_1 = c_2 = 1$, the range for $\mathbb{S}^1$ is $[0, 2\pi)$ using radians, but $[0, 360)$ using degrees. 
If the same constant $c_2$ is used in either case, two very different metrics will be 
obtained. The units applied to $\mathbb{R}^2$ and $\mathbb{S}^1$ are completely incompatible. ■ 
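
The effect of the angular units can be seen with a short Python sketch of the Cartesian product rule (5.4) applied to $SE(2)$. Everything here is an illustrative assumption: the function name `se2_metric`, the configuration layout `(x, y, theta)`, and the `full_turn` parameter that selects radians or degrees. With $c_1 = c_2 = 1$, the same two candidate configurations swap their order of closeness when the units change.

```python
import math


def se2_metric(q1, q2, c1=1.0, c2=1.0, full_turn=2.0 * math.pi):
    """Weighted sum (5.4) of planar distance and angular distance.

    q1 and q2 are (x, y, theta); full_turn is 2*pi for radians or 360.0
    for degrees, and theta must be expressed in the matching unit.
    """
    (x1, y1, t1), (x2, y2, t2) = q1, q2
    planar = math.hypot(x1 - x2, y1 - y2)
    d = abs(t1 - t2) % full_turn
    angular = min(d, full_turn - d)
    return c1 * planar + c2 * angular


if __name__ == "__main__":
    start = (0.0, 0.0, 0.0)
    translated = (3.0, 0.0, 0.0)            # pure translation by 3 units

    # A pure quarter-turn rotation, written in radians and then in degrees.
    rotated_rad = (0.0, 0.0, math.pi / 2)
    rotated_deg = (0.0, 0.0, 90.0)

    print("radians: translation", se2_metric(start, translated),
          "rotation", se2_metric(start, rotated_rad))
    print("degrees: translation", se2_metric(start, translated, full_turn=360.0),
          "rotation", se2_metric(start, rotated_deg, full_turn=360.0))
```

In radians the rotated configuration is the nearer one; in degrees the translated configuration is. A nearest-neighbor query in a planner would therefore return different answers for the same physical situation.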



Example 5.1.6 (Robot displacement metric) Sometimes this incompatibil- 
ity problem can be fixed by considering the robot displacement. For any two 
configurations $q_1, q_2 \in \mathcal{C}$, a robot displacement metric can be defined as 

$$\rho(q_1, q_2) = \max_{a \in \mathcal{A}} \|a(q_1) - a(q_2)\|, \tag{5.11}$$

in which $a(q_i)$ is the position of the point $a$ in the world when the robot, $\mathcal{A}$, is at 
configuration $q_i$. Intuitively, the robot displacement metric yields the maximum 
amount in $\mathcal{W}$ that any part of the robot is displaced when moving from configu- 
ration $q_1$ to $q_2$. ■ 
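
As an illustration of (5.11), the following Python sketch computes the displacement metric for a rigid polygonal robot in the plane. The names `transform` and `displacement_metric`, and the configuration layout `(x, y, theta)`, are assumptions made for this example. The maximization is taken over the polygon vertices only. For a rigid body this should suffice, because the displacement $a(q_1) - a(q_2)$ is an affine function of the body point $a$, so its norm is convex and attains its maximum over the polygon at a vertex of the convex hull.

```python
import math


def transform(point, q):
    """World position a(q) of body point a for configuration q = (x, y, theta)."""
    px, py = point
    x, y, theta = q
    c, s = math.cos(theta), math.sin(theta)
    return (x + c * px - s * py, y + s * px + c * py)


def displacement_metric(vertices, q1, q2):
    """Evaluate (5.11) by maximizing over the polygon vertices."""
    best = 0.0
    for v in vertices:
        (x1, y1), (x2, y2) = transform(v, q1), transform(v, q2)
        best = max(best, math.hypot(x1 - x2, y1 - y2))
    return best


if __name__ == "__main__":
    long_rod = [(0.0, 0.0), (2.0, 0.0)]     # body frame at one end of the rod
    short_rod = [(0.0, 0.0), (0.5, 0.0)]
    q_start = (0.0, 0.0, 0.0)
    q_shift = (1.0, 0.0, 0.0)               # translate by one unit
    q_turn = (0.0, 0.0, math.pi)            # half turn about the body origin

    for name, rod in (("long", long_rod), ("short", short_rod)):
        print(name, "translation:", displacement_metric(rod, q_start, q_shift))
        print(name, "rotation   :", displacement_metric(rod, q_start, q_turn))
```

The same rotation counts for more on the long rod than on the short one, which is how the robot geometry resolves the unit mismatch between translation and rotation.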



Example 5.1.7 ($\mathbb{T}^n$ metrics) Next consider making a metric over a torus, $\mathbb{T}^n$. 
The Cartesian product rule (??) can be extended over every copy of $\mathbb{S}^1$ (one for 
each parameter, $\theta_i$). This leads to $n$ arbitrary coefficients, $c_1, c_2, \ldots, c_n$. Robot 
displacement could be used to determine the coefficients. For example, if the robot 
is a chain of links, it might make sense to weight changes in the first link more 
heavily because the entire linkage moves in this case. When the last parameter 
is changed, only the last link moves; in this case, it might make sense to give less 
weight. ■ 



Example 5.1.8 (SE(3) metrics) Metrics for $SE(3)$ can be formed by applying 
the Cartesian product rules to a metric for $\mathbb{R}^3$ and the metric for $SO(3)$, which 
is given in (5.10). Again, this unfortunately leaves coefficients to choose. These 
issues will arise again in Section 5.3.4, where more details appear on robot 
displacement. ■ 



Pseudometrics In many planning algorithms one may want to define functions 
that behave somewhat like a distance function, but may fail to satisfy all of the 
metric axioms. If such distance functions are used, they will be referred to as 
pseudometrics. One general principle that can be used to derive pseudometrics 
is by defining the distance to be the optimal cost-to-go for some criterion (recall 
discrete cost-to-go functions from Section 2.4). 
In the continuous setting, the cost could correspond to the distance traveled 
by a robot, or even the amount of energy consumed. Sometimes, the resulting 
pseudometric will not be symmetric. For example, it requires less energy for a 
car to travel downhill, as opposed to uphill. Suppose that a car is only capable of 
driving forward. It might travel a short distance to go forward from $q_1$ to some 
$q_2$, but it might have to travel a longer distance to reach $q_1$ from $q_2$ because it 
cannot drive in reverse. These issues arise for the Dubins car, which is covered in 
Section 13.3.1. 

Another example of a pseudometric is the concept of a potential function in 
robotics. This function is an important part of the randomized potential field 
method, which is discussed in Section 5.4.3. The idea is to make a scalar function 
that estimates the distance to the goal; however, there may be additional terms 
that attempt to repel the robot away from obstacles. This will generally cause local 
minima to appear in the distance function, which may cause potential functions 
to violate the triangle inequality. 

5.1.3 Basic Measure Theory Definitions 

This section briefly indicates how to measure volume in a metric space. This 
provides a basis for defining concepts such as integrals or probability densities. 
Measure theory is an advanced mathematical topic that is well beyond the scope 
of this book; however, it is worthwhile to briefly introduce some of the basic 
definitions because they sometimes arise in sampling-based planning. 

Measure can be considered as a function that produces real values for subsets 
of a metric space, $(X, \rho)$. Ideally, we would like to produce a nonnegative value, 
$\mu(A) \in [0, \infty]$, for any subset $A \subseteq X$. Unfortunately, due to the Banach-Tarski 
paradox, if $X = \mathbb{R}^n$, there are some subsets for which trying to assign volume 
leads to a contradiction. If $X$ is finite, this cannot happen. Therefore, it is hard 
to visualize the problem; see [664] for a construction of these bizarre sets. Because 
of this problem, a workaround was developed that defines a collection of subsets 
that avoids the paradoxical ones. A collection, $\mathcal{B}$, of subsets of $X$ is called a 
sigma algebra if the following axioms are satisfied: 

1. The empty set is in $\mathcal{B}$. 

2. If $B \in \mathcal{B}$, then $X \setminus B \in \mathcal{B}$. 

3. For any collection of a countable number of sets in $\mathcal{B}$, their union must also 
be in $\mathcal{B}$. 

Note that the last two conditions together imply that the intersection of a countable 
number of sets in $\mathcal{B}$ is also in $\mathcal{B}$. The sets in $\mathcal{B}$ are called the measurable sets. 

A nice sigma algebra, called the Borel sets, can be formed from any metric 
space $(X, \rho)$ as follows. Start with the set of all open balls in $X$. This yields sets 
of the form 

$$B(x, r) = \{x' \in X \mid \rho(x, x') < r\} \tag{5.12}$$

for any $x \in X$ and any $r \in [0, \infty)$. From the open balls, Borel sets, $\mathcal{B}$, are the 
sets that can be constructed from these open balls by using the sigma algebra 
axioms. For example, an open square in $\mathbb{R}^2$ is in $\mathcal{B}$ because it can be constructed 
as the union of a countable number of balls (infinitely many are needed because 
the curved balls must converge to covering the straight square edges). By using 
Borel sets, the nastiness of nonmeasurable sets is safely avoided. 

Example 5.1.9 A simple example of $\mathcal{B}$ can be constructed for $\mathbb{R}$. The open balls 
are just the set of all open intervals, $(x_1, x_2) \subset \mathbb{R}$, for any $x_1, x_2 \in \mathbb{R}$ such that 
$x_1 < x_2$. ■ 

Using $\mathcal{B}$, a measure, $\mu$, is now defined as a function $\mu : \mathcal{B} \rightarrow [0, \infty]$ such that 
the measure axioms are satisfied: 

1. For the empty set, $\mu(\emptyset) = 0$. 

2. For any collection, $E_1, E_2, E_3, \ldots$, of a countable number of pairwise disjoint, 
measurable sets, let $E$ denote their union. The measure, $\mu$, must satisfy 

$$\mu(E) = \sum_{i} \mu(E_i), \tag{5.13}$$

in which $i$ counts over the whole collection. 

Example 5.1.10 (Lebesgue measure) The most common and important mea- 
sure is the Lebesgue measure, which yields the standard notions of length in $\mathbb{R}$, 
area in $\mathbb{R}^2$, and volume in $\mathbb{R}^n$ for $n \geq 3$. One important concept about Lebesgue 
measure is the existence of sets of measure zero. For any countable set, $A$, the 
Lebesgue measure yields $\mu(A) = 0$. For example, what is the total length of the 
point $\{1\} \subset \mathbb{R}$? The length of any single point must be zero. To satisfy the 
measure axioms, sets such as $\{1, 3, 4, 5\}$ must also have measure zero. Even infinite 
subsets, such as $\mathbb{Z}$ and $\mathbb{Q}$, have measure zero in $\mathbb{R}$. If the dimension of a set, 
$A \subseteq \mathbb{R}^m$, is $n$ for some integer $n < m$, then $\mu(A) = 0$, using the Lebesgue measure 
on $\mathbb{R}^m$. For example, the set $\mathbb{S}^2 \subset \mathbb{R}^3$ has measure zero because the sphere has no 
volume. However, we might want to restrict the measure space to be $\mathbb{S}^2$ and then 
define surface area. In this case nonzero measure is obtained. ■ 

Example 5.1.11 (The counting measure) If $(X, \rho)$ is finite, then the count- 
ing measure can be defined. In this case, the measure can be defined over the 
entire power set of $X$. For any $A \subseteq X$, the counting measure yields $\mu(A) = |A|$, 
the number of elements in $A$. Verify that this satisfies the measure axioms. ■ 

Example 5.1.12 (Probability measure) Measure theory even unifies discrete 
and continuous probability theory. The measure $\mu$ can be defined to yield proba- 
bility mass. The probability axioms are consistent with the measure axioms, which 
yields a measure space. The integrals and sums needed to define expectations of 
random variables for continuous and discrete cases, respectively, unify into a single 
measure-theoretic integral. ■ 
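
For a finite set, the measure axioms can be checked by brute force, which is one way to carry out the verification suggested in Example 5.1.11. The Python sketch below is only an illustration; all names are assumptions. It tests $\mu(\emptyset) = 0$ and additivity over every pair of disjoint subsets. For a finite $X$, countable additivity reduces to finite additivity, which follows from the pairwise case by induction. The same check is applied to a discrete probability measure in the spirit of Example 5.1.12, and to a function that is not a measure.

```python
from itertools import combinations


def power_set(X):
    """All subsets of the finite set X, as frozensets."""
    items = list(X)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def counting_measure(A):
    """Counting measure: the number of elements in A."""
    return len(A)


def make_probability_measure(mass):
    """Measure induced by a dict that maps each point to its probability mass."""
    def mu(A):
        return sum(mass[x] for x in A)
    return mu


def satisfies_measure_axioms(mu, X, tol=1e-12):
    """Check mu(empty set) = 0 and additivity on all disjoint pairs of subsets."""
    if abs(mu(frozenset())) > tol:
        return False
    subsets = power_set(X)
    for A in subsets:
        for B in subsets:
            if A.isdisjoint(B) and abs(mu(A | B) - (mu(A) + mu(B))) > tol:
                return False
    return True


if __name__ == "__main__":
    X = {1, 3, 4, 5}
    prob = make_probability_measure({1: 0.1, 3: 0.2, 4: 0.3, 5: 0.4})
    print("counting measure   :", satisfies_measure_axioms(counting_measure, X))
    print("probability measure:", satisfies_measure_axioms(prob, X))
    # Squaring the count breaks additivity, so this is not a measure.
    print("squared count      :", satisfies_measure_axioms(lambda A: len(A) ** 2, X))
```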



Measure theory can be used to define very general notions of integration that 
are much more powerful than the Riemann integral that is learned in classical 
calculus. One of the most important concepts is the Lebesgue integral. Instead 
of being limited to partitioning the domain of integration into intervals, virtually 
any partition into measurable sets can be used. Its definition requires the notion 
of a measurable function to ensure that the function domain is partitioned into 
measurable sets. For further study, see [253, 664]. 

5.1.4 Using the Correct Measure 

Since many metrics and measures are possible, it may sometimes seem that there is 
no "correct" choice. This can be frustrating because the performance of sampling- 
based planning algorithms can depend strongly on these. Fortunately, there is a 
natural measure, called the Haar measure, for the transformation groups SO(N) 
and SE(N). Good metrics also follow from the Haar measure, but unfortunately, 
there are still arbitrary alternatives. 

The basic requirement is that the measure does not vary when the sets are 
transformed using the group elements. More formally, let $G$ represent a matrix 
group with real-valued entries, and let $\mu$ denote a measure on $G$. If for any 
measurable subset $A \subseteq G$, and any element $g \in G$, $\mu(A) = \mu(gA) = \mu(Ag)$, then 
$\mu$ is called the Haar measure^2 for $G$. The notation $gA$ represents the set of all 
matrices obtained by the product $ga$, for any $a \in A$. Similarly, $Ag$ represents all 
products of the form $ag$. 

Example 5.1.13 (Haar measure for SO(2)) The Haar measure for $SO(2)$ can 
be obtained by parameterizing the rotations as $[0, 1]/\sim$ with 0 and 1 identified, 
and letting $\mu$ be the Lebesgue measure on the unit interval. To see the invariance 
property, consider the interval $[1/4, 1/2]$, which produces a set $A \subset SO(2)$ of 
rotation matrices. These correspond to the set of all rotations from $\theta = \pi/2$ to 
$\theta = \pi$. The measure yields $\mu(A) = 1/4$. Now consider multiplying every matrix 
$a \in A$ by a rotation matrix, $g \in SO(2)$, to yield $Ag$. Suppose $g$ is the rotation 
matrix for $\theta = \pi$. The set $Ag$ is the set of all rotation matrices from $\theta = 3\pi/2$ 
up to $\theta = 2\pi = 0$. The measure, $\mu(Ag) = 1/4$, remains unchanged. Similarly, 
invariance for $gA$ may be checked. The transformation $g$ translates the intervals 
in $[0, 1]/\sim$. Since the measure is based on interval lengths, it is invariant with 
respect to translation. Note that $\mu$ can be multiplied by a fixed constant (such as 
$2\pi$) without affecting the invariance property. 

^2 Such a measure is unique up to scale, and exists for any locally-compact topological group 
[253, 664]. 

An invariant metric can also be defined from the Haar measure on $SO(2)$. For 
any points $x_1, x_2 \in [0, 1]$, let $\rho = \mu([x_1, x_2])$, in which $[x_1, x_2]$ is the shortest-length 
(smallest measure) interval that contains $x_1$ and $x_2$ as endpoints. Up to the scale 
factor $2\pi$, this metric was already given in Example 5.1.2. 

To obtain examples that are not the Haar measure, let $\mu$ represent probabil- 
ity mass over $[0, 1]$, and define any nonuniform probability density function (the 
uniform density yields the Haar measure). Any shifting of intervals will change 
the probability mass, resulting in a different measure. 

Note that failing to use the Haar measure weights some parts of $SO(2)$ more 
heavily than others. Sometimes imposing a bias may be desirable, but it is at least 
as important to know how to eliminate bias. These ideas may appear obvious, but 
in the case of $SO(3)$ and many other groups it is more challenging to eliminate 
this bias and obtain the Haar measure. ■ 
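
The invariance argument of Example 5.1.13 can be replayed numerically. The Python sketch below is an illustration under stated assumptions: rotations are parameterized by $[0, 1]/\sim$, an arc is described by its start point and a length of at most 1, and a measure is specified by its cumulative distribution function on $[0, 1]$. The density $2x$ (with CDF $x^2$) is an arbitrary nonuniform choice introduced here for contrast; it does not come from the text.

```python
def uniform_cdf(x):
    """CDF of the Lebesgue measure on the unit interval (the Haar measure)."""
    return x


def biased_cdf(x):
    """CDF of the density 2x, an assumed nonuniform example."""
    return x * x


def arc_measure(cdf, start, length):
    """Measure of the arc [start, start + length] on [0, 1]/~, with length <= 1."""
    start %= 1.0
    end = start + length
    if end <= 1.0:
        return cdf(end) - cdf(start)
    # The arc wraps past the identified point 0 ~ 1, so split it in two.
    return (cdf(1.0) - cdf(start)) + (cdf(end - 1.0) - cdf(0.0))


def rotate(start, g):
    """Apply the rotation g, given as a fraction of a full turn, to an arc start."""
    return (start + g) % 1.0


if __name__ == "__main__":
    start, length = 0.25, 0.25      # the set A: rotations from pi/2 to pi
    g = 0.5                         # the rotation by pi

    for name, cdf in (("uniform", uniform_cdf), ("biased", biased_cdf)):
        before = arc_measure(cdf, start, length)
        after = arc_measure(cdf, rotate(start, g), length)
        print(name, "mu(A) =", before, " mu(Ag) =", after)
```

Only the uniform density assigns the same value to $A$ and $Ag$; the nonuniform density changes the value, so it does not define the Haar measure.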



Example 5.1.14 (Haar measure for SO(3)) For $SO(3)$ it turns out once again 
that quaternions come to the rescue. If unit quaternions are used, recall that 
$SO(3)$ becomes parameterized in terms of $\mathbb{S}^3$, but opposite points are identified. 
It can be shown that the surface area on $\mathbb{S}^3$ is the Haar measure. (Since $\mathbb{S}^3$ is a 
three-dimensional manifold, it may more appropriately be considered as a bound- 
ary volume.) It will be seen in Section 5.2.2 that uniform random sampling over 
$SO(3)$ must be done with a uniform probability density over $\mathbb{S}^3$. This corresponds 
exactly to the Haar measure. If instead, $SO(3)$ is parameterized with Euler angles, 
the Haar measure will not be obtained. An unintentional bias will be introduced; 
some rotations in $SO(3)$ will have more weight than others for no particularly 
good reason. ■ 



5.2 Sampling Theory 

5.2.1 Motivation and Basic Concepts 

The state space for motion planning, C, is uncountably infinite, yet any planning 
algorithm can consider at most a countable number of samples. If the algorithm 
runs forever, this may be countably infinite, but in practice, we expect it to ter- 
minate early after only considering a finite number of samples. This mismatch 
between the cardinality of C and the set that can be probed by an algorithm moti- 
vates careful consideration of sampling techniques. Once the sampling component 
has been defined, discrete planning methods from Chapter 2 may be adapted to 
the current setting. Their performance, however, hinges on the way the C-space 
is sampled. 

Since sampling-based planning algorithms will often be terminated early, the 
particular order in which samples are chosen becomes critical. Therefore, a dis- 
tinction is made between a sample set and a sample sequence. A unique sample 
set can always be constructed from a sample sequence, but many sequences can 
be constructed from one sample set. 

Denseness Consider constructing an infinite sample sequence over $\mathcal{C}$. What 
would be some desirable properties for this sequence? It would be nice if the 
sequence eventually reached every point in $\mathcal{C}$, but this is impossible because $\mathcal{C}$ is 
uncountably infinite. Strangely, it is still possible for a sequence to get arbitrarily 
close to every element of $\mathcal{C}$ (assuming $\mathcal{C} \subseteq \mathbb{R}^m$). In topology, this is the notion 
of denseness. Let $U$ and $V$ be any subsets of a topological space. The set $U$ is 
said to be dense in $V$ if $\mathrm{cl}(U) = V$ (recall the closure of a set from Section 4.1.1). 
This means adding the boundary points to $U$ produces $V$. A simple example is 
that $(0, 1) \subset \mathbb{R}$ is dense in $[0, 1] \subset \mathbb{R}$. A more interesting example is that the 
set $\mathbb{Q}$ of rational numbers is both countable and dense in $\mathbb{R}$. Think about why. 
For any real number, such as $\pi \in \mathbb{R}$, there exists a sequence of fractions that will 
converge to it. This sequence of fractions is a subset of $\mathbb{Q}$. A sequence will be called 
dense if its underlying set is dense. The bare minimum for sampling methods is 
that they produce a dense sequence. Stronger requirements, such as uniformity 
and regularity, will be explained shortly. 

A random sequence is probably dense One of the simplest ways concep- 
tually to obtain a dense sequence is to pick points at random in $[0, 1]$. Suppose 
$I \subset [0, 1]$ is an interval of length $\epsilon$. If $k$ samples are chosen independently at 
random, the probability that none of them falls into $I$ is $(1 - \epsilon)^k$. As $k$ approaches 
infinity, this probability converges to zero. This means that the probability that 
any interval in $[0, 1]$ contains no points converges to zero. One small technicality 
exists. The infinite sequence of independently, randomly chosen points is dense 
with probability one, which is not the same as being guaranteed. This is one of the 
strange outcomes of dealing with uncountably infinite sets in probability theory. 
For example, if a number in $[0, 1]$ is chosen at random, the probability that 
$\pi/4$ is chosen is zero; however, it is still possible. (The probability is just the 
Lebesgue measure, which is zero for a set of measure zero.) For motion planning 
purposes, this technicality has no practical implications; however, if $k$ is not very 
large, then it might be frustrating to obtain only probabilistic assurances, as op- 
posed to absolute guarantees of coverage. The next sequence is guaranteed to be 
dense because it is deterministic. 
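
The probability that $k$ uniform samples all miss a fixed interval of length $\epsilon$ is $(1 - \epsilon)^k$, which can be compared against a simulation. The following Python sketch is illustrative only; the function names, the seed, and the trial count are assumptions chosen for the example.

```python
import random


def miss_probability(eps, k):
    """Probability that k independent uniform samples on [0, 1] all miss
    a fixed interval of length eps."""
    return (1.0 - eps) ** k


def empirical_miss_rate(lo, eps, k, trials, rng):
    """Fraction of trials in which none of k samples lands in [lo, lo + eps]."""
    misses = 0
    for _ in range(trials):
        if not any(lo <= rng.random() <= lo + eps for _ in range(k)):
            misses += 1
    return misses / trials


if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed so that runs are repeatable
    eps = 0.1
    for k in (1, 10, 50, 200):
        print("k = %3d  exact = %.6f  simulated = %.6f"
              % (k, miss_probability(eps, k),
                 empirical_miss_rate(0.3, eps, k, 5000, rng)))
```

The exact value decays geometrically in $k$ but never reaches zero for finite $k$, which is the practical meaning of a probabilistic assurance as opposed to a guarantee.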

The van der Corput sequence A beautiful yet underutilized sequence was 
published in 1935 by van der Corput, a Dutch mathematician [759]. It exhibits 
many ideal qualities for applications. At the same time, it is based on a simple 
idea. Unfortunately, it is only defined for the unit interval. The quest to extend 
many of its qualities to higher-dimensional spaces motivates the formal quality 
measures and sampling techniques in the remainder of this section. 

Figure 5.2: The van der Corput sequence is obtained by reversing the bits in the 
binary decimal representation of the naive sequence. 

To explain the van der Corput sequence, let $\mathcal{C} = [0, 1]/\sim$, in which $0 \sim 1$ 
(recall identifications from Section 4.1.2), which can be interpreted as $SO(2)$. 
Suppose that we want to place 16 samples in $\mathcal{C}$. An ideal choice is the set $S = 
\{i/16 \mid 0 \leq i < 16\}$, which evenly spaces the points at intervals of length 1/16. 
This means that no point in $\mathcal{C}$ is further than 1/32 from the nearest sample. What 
if we want to make $S$ into a sequence? What is the best ordering? What if we 
are not even sure that 16 points are sufficient? Maybe 16 is too few or even too 
many. 

The first two columns of Figure 5.2 show a naive attempt at making $S$ into a 
sequence by sorting the points by increasing value. The problem is that after $i = 8$, 
half of $\mathcal{C}$ has been neglected. It would be preferable to have a nice covering of 
$\mathcal{C}$ for any $i$. Van der Corput's clever idea was to reverse the order of the bits, 
when the sequence is represented with binary decimals. In the original sequence, 
the most significant bit toggles only once, while the least significant bit toggles in 
every step. By reversing the bits, the most significant bit toggles in every step, 
which means that the sequence alternates between the lower and upper halves of 
$\mathcal{C}$. The third and fourth columns of Figure 5.2 show the original and reversed-order 
binary representations. The resulting sequence dances around $[0, 1]/\sim$ in a nice 
way, as shown in the last two columns of Figure 5.2. Let $\nu(i)$ denote the $i$th point 
of the van der Corput sequence. 

In contrast to the naive sequence, each $\nu(i)$ lies far away from $\nu(i + 1)$. Fur- 
thermore, the first $i$ points of the sequence, for any $i$, provide reasonably uniform 
coverage of $\mathcal{C}$. When $i$ is a power of 2, the points are perfectly spaced. For other 
$i$, the coverage is still good in the sense that the number of points that appear 
in any interval of length $l$ will be roughly $il$. For example, when $i = 10$, every 
interval of length 1/2 contains roughly 5 points. 

The length, 16, of the naive sequence is actually not important. If instead 
8 was used, the same $\nu(1), \ldots, \nu(8)$ would be obtained. Observe in the reverse 
binary column of Figure 5.2 that this amounts to removing the last zero from each 
binary decimal representation, which does not alter their values. If 32 is used 
for the naive sequence, then the same $\nu(1), \ldots, \nu(16)$ will be obtained, and the 
sequence would continue nicely from $\nu(17)$ to $\nu(32)$. To obtain the van der Corput 
sequence from $\nu(33)$ to $\nu(64)$, six-bit sequences are reversed (corresponding to the 
case in which the naive sequence has 64 points). The process repeats to produce 
an infinite sequence that does not require a fixed number of points to be specified 
a priori. In addition to the nice uniformity properties for every $i$, the infinite 
van der Corput sequence is also dense in $[0, 1]/\sim$. This implies that every open 
subset must contain at least one sample. 
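
The bit-reversal construction can be sketched in a few lines of Python. This is an illustration under an indexing assumption: following the text's use of $\nu(1), \ldots, \nu(8)$, the sketch takes $i = 1, 2, 3, \ldots$ and reverses the bits of the naive index $i - 1$, so that $\nu(1) = 0$. The function name `van_der_corput` is likewise an assumption. No fixed number of bits is needed, because reversing a longer bit string only appends trailing zeros after the binary point.

```python
def van_der_corput(i):
    """Return nu(i) for i = 1, 2, 3, ..., assuming nu(1) = 0."""
    if i < 1:
        raise ValueError("the index i starts at 1")
    n = i - 1           # naive index 0, 1, 2, ...
    value = 0.0
    weight = 0.5        # weight of the first bit after the binary point
    while n > 0:
        # The least significant bit of n becomes the most significant
        # bit of the binary fraction.
        value += weight * (n & 1)
        n >>= 1
        weight *= 0.5
    return value


if __name__ == "__main__":
    # Rebuild the columns described for Figure 5.2, using 4-bit words.
    print(" i  naive    binary  reversed  nu(i)")
    for i in range(1, 17):
        n = i - 1
        bits = format(n, "04b")
        print("%2d  %.4f   .%s   .%s    %.4f"
              % (i, n / 16.0, bits, bits[::-1], van_der_corput(i)))
```

After any power-of-two number of points the samples coincide with an evenly spaced grid, and the sequence can simply be continued if more samples turn out to be needed.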

You have now seen ways to generate nice samples in a unit interval both ran- 
domly and deterministically. Sections 5.2.2-5.2.4 explain how to generate dense 
samples with nice properties in the complicated spaces that arise in motion plan- 
ning. 

5.2.2 Random Sampling 

Now imagine moving beyond $[0, 1]$ and generating a dense sample sequence for 
any bounded configuration space, $\mathcal{C} \subseteq \mathbb{R}^m$. In this section the goal is to gener- 
ate uniform random samples. This means that the probability density function 
$p(q)$ over $\mathcal{C}$ is uniform. Wherever relevant, it will also mean that the probability 
density is consistent with the Haar measure. We will not allow any artificial 
bias to be introduced by selecting a poor parameterization. For example, pick- 
ing uniform random Euler angles does not lead to uniform random samples over 
$SO(3)$. However, picking uniform random unit quaternions will work perfectly 
because quaternions use the same parameterization as the Haar measure; both 
choose points on $\mathbb{S}^3$. 

Random sampling is the easiest of all sampling methods to apply to configura- 
tion spaces. One of the main reasons is that C-spaces are formed from Cartesian 
products, and independent random samples extend easily across these products. 
If $X = X_1 \times X_2$, and uniform random samples $x_1$ and $x_2$ are taken from $X_1$ and $X_2$, 
respectively, then $(x_1, x_2)$ is a uniform random sample for $X$. This is very conve- 
nient in implementations. For example, suppose the motion planning problem involves 
15 robots that each translate for any $(x_t, y_t) \in [0, 1]^2$. This yields $\mathcal{C} = [0, 1]^{30}$. In 
this case, 30 points can be chosen uniformly at random from $[0, 1]$ and combined 
into a 30-dimensional vector. Samples generated this way will be uniformly ran- 
domly distributed over $\mathcal{C}$. Combining samples over Cartesian products is much 
more difficult for nonrandom (deterministic) methods, presented in Sections 5.2.3 
and 5.2.4. 
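
The 15-robot example amounts to drawing 30 independent coordinates. A minimal Python sketch, with illustrative function names and a seed that are assumptions of this example, could look as follows.

```python
import random


def sample_unit_cube(dim, rng):
    """Uniform random sample from [0, 1]^dim, drawn one coordinate at a time."""
    return [rng.random() for _ in range(dim)]


def sample_translating_robots(num_robots, rng):
    """One configuration for planar translating robots, as (x, y) per robot."""
    q = sample_unit_cube(2 * num_robots, rng)
    return [(q[2 * i], q[2 * i + 1]) for i in range(num_robots)]


if __name__ == "__main__":
    rng = random.Random(42)         # fixed seed so that runs are repeatable
    configuration = sample_translating_robots(15, rng)
    for index, (x, y) in enumerate(configuration):
        print("robot %2d: (%.3f, %.3f)" % (index, x, y))
```

Each coordinate is drawn independently, so no coordination between the factors of the Cartesian product is required.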

Generating a random element of SO(3) One has to be very careful about 
sampling uniformly over the space of rotations. The probability density must 
correspond to the Haar measure, which means that a random rotation should be 
obtained by picking a point at random on $\mathbb{S}^3$ and forming the unit quaternion. An 
extremely clever way to sample $SO(3)$ uniformly at random is given in [26], and 
is reproduced here. Choose three points $u_1, u_2, u_3 \in [0, 1]$ uniformly at random. 
The random quaternion is given by the simple expression 

$$h = \left( \sqrt{1 - u_1} \sin 2\pi u_2, \; \sqrt{1 - u_1} \cos 2\pi u_2, \; \sqrt{u_1} \sin 2\pi u_3, \; \sqrt{u_1} \cos 2\pi u_3 \right). \tag{5.14}$$

A full explanation of the method is given in [26], and a brief intuition is given 
here. First drop down a dimension and pick u±,U2 G [0,1] to generate points 
on S 2 . Let U\ represent the value for the third coordinate, (0, 0, Ui) G M 3 . The 
slice of points on §> 2 for which u\ is fixed for < u\ < 1 yields a circle on 5 2 
that corresponds to some line of latitude on § 2 . The second parameter selects the 
longitude, 27ra 2 . Unfortunately, the points will not be uniformly distributed over 
S 2 . Why? Imagine § 2 as the crust on a spherical loaf of bread that is run through 
a bread sheer. The slices are cut in a direction parallel to the equator, and are of 
equal thickness. The crusts of each slice will not have equal area; therefore, the 
points will not be uniformly distributed. However, for § 3 , the 3D crusts happen to 
have the same area (or measure); this can be shown by evaluating surface integrals. 
This implies that a (infinitesimal) slice can be selected uniformly at random with 
Ui, and a point on the crust is selected uniformly at random by w 2 and u 3 . For S 4 
and beyond, the measure of the crusts vary, which means this elegant scheme only 
works for S 3 . To respect the antipodal identification for rotations, any quaternion 
h found in the lower hemisphere (i.e., a < 0) can be negated to yield —h. This 
will not affect the uniform random distribution of the samples. 
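
A minimal sketch of (5.14) in Python follows. The function name
`random_quaternion` is hypothetical, and it is assumed here that $a$ denotes
the first component of $h$ when the antipodal negation is applied.

```python
import math
import random

def random_quaternion(rng=random):
    """Sample a unit quaternion using expression (5.14)."""
    u1, u2, u3 = rng.random(), rng.random(), rng.random()
    s1 = math.sqrt(1.0 - u1)
    s2 = math.sqrt(u1)
    h = (
        s1 * math.sin(2.0 * math.pi * u2),
        s1 * math.cos(2.0 * math.pi * u2),
        s2 * math.sin(2.0 * math.pi * u3),
        s2 * math.cos(2.0 * math.pi * u3),
    )
    # Antipodal identification: h and -h represent the same rotation, so
    # keep the representative whose first component (assumed to be a) is
    # nonnegative.
    if h[0] < 0.0:
        h = tuple(-c for c in h)
    return h
```

The squared components sum to $(1 - u_1) + u_1 = 1$, so the result needs no
normalization step.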

Generating random directions Some sampling-based algorithms require choosing
motion directions at random. From a configuration $q$, the possible directions
of motion can be imagined as being distributed around a sphere. In an
$(n+1)$-dimensional C-space, this corresponds to sampling on $S^n$. For example,
choosing a direction in $\mathbb{R}^2$ amounts to picking an element of $S^1$;
this can be parameterized as $\theta \in [0, 2\pi]/\sim$. If $n = 3$, then the
previously mentioned trick for $SO(3)$ should be used. If $n = 2$ or $n > 3$,
then samples can be generated using a slightly more expensive method that
exploits spherical symmetries of the multidimensional Gaussian density function
[251]. The method is explained for $\mathbb{R}^{n+1}$; boundaries and
identifications must be taken into account for other spaces. For each of the
$n + 1$ coordinates, generate a sample, $u_i$, from a zero-mean Gaussian
distribution with the same variance for each coordinate. Following from the
Central Limit Theorem, $u_i$ can be approximately obtained by generating $k$
samples at random over $[-1, 1]$ and adding them ($k \ge 12$ is usually
sufficient). The vector $(u_1, u_2, \ldots, u_{n+1})$ gives a random direction
in $\mathbb{R}^{n+1}$ because each $u_i$ was obtained independently, and the
level sets of the resulting probability density function are spheres. We did
not use uniform random samples for each $u_i$ because this would bias the
directions toward the corners of a cube; instead, the Gaussian yields spherical
symmetry. The final step is to normalize the vector by taking $u_i / \|u\|$
for each coordinate.
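
The procedure can be sketched as follows, using the sum-of-uniforms
approximation described above. The names `approx_gaussian` and
`random_direction` are hypothetical, and the default of 12 terms is only an
illustrative choice.

```python
import math
import random

def approx_gaussian(rng, terms=12):
    """Approximate a zero-mean Gaussian sample by summing uniform samples
    drawn from [-1, 1], as suggested by the Central Limit Theorem."""
    return sum(rng.uniform(-1.0, 1.0) for _ in range(terms))

def random_direction(dim, rng=random):
    """Return a random unit vector in R^dim (a point on S^(dim - 1))."""
    while True:
        u = [approx_gaussian(rng) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in u))
        if norm > 0.0:  # guard against the (very unlikely) zero vector
            return [c / norm for c in u]
```

If an exact Gaussian is preferred, `rng.gauss(0.0, 1.0)` from the Python
standard library could replace `approx_gaussian` without changing the rest.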

Pseudorandom number generation Although there are advantages to uni- 
form random sampling, there are also several disadvantages. This motivates the 
consideration of deterministic alternatives. Since there are tradeoffs, it is impor- 
tant to understand how to use both kinds of sampling in motion planning. One of 
the first issues is that computer-generated numbers are not random.$^3$ A
pseudorandom number generator is usually employed, which is a deterministic method
that simulates the behavior of randomness. Since the samples are not truly ran- 
dom, the advantage of extending the samples over Cartesian products does not 
necessarily hold. Sometimes problems are caused by unforeseen deterministic de- 
pendencies. One of the best pseudorandom number generators for avoiding such 
troubles is the Mersenne twister [540], for which implementations can be found 
on the internet. 

To help see the general difficulties, the classical linear congruential
pseudorandom number generator is briefly explained [478, 579]. The method uses
three integer parameters, $M$, $a$, and $c$, which are chosen by the user. The
first two, $M$ and $a$, must be relatively prime, meaning $\gcd(M, a) = 1$. The
third parameter, $c$, must be chosen to satisfy $0 \le c < M$. Using modular
arithmetic, a sequence can be generated as

$$y_{i+1} = (a y_i + c) \bmod M, \qquad (5.15)$$

by starting with some arbitrary seed $1 \le y_0 \le M$. Pseudorandom numbers in
$[0, 1]$ are generated by the sequence

$$x_i = y_i / M. \qquad (5.16)$$

The sequence is periodic; therefore, $M$ is typically very large (e.g.,
$M = 2^{31} - 1$). Due to periodicity, there are potential problems of
regularity appearing in the samples, especially when applied across a Cartesian
product to generate points in $\mathbb{R}^n$. Particular values must be chosen
for the parameters, and statistical tests are used to evaluate the samples
either experimentally or theoretically [579].
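
A minimal sketch of (5.15) and (5.16) is given below. The default parameters
$a = 16807$, $c = 0$, and $M = 2^{31} - 1$ are one well-known choice (the
Park-Miller "minimal standard"); they are used here only for illustration and
are not prescribed by the text.

```python
def lcg(seed, a=16807, c=0, m=2**31 - 1):
    """Yield pseudorandom numbers in [0, 1] from a linear congruential
    generator: y_{i+1} = (a * y_i + c) mod m, x_i = y_i / m."""
    y = seed
    while True:
        y = (a * y + c) % m
        yield y / m

gen = lcg(seed=1)
first_three = [next(gen) for _ in range(3)]
```

Because the whole state is the single integer `y`, the sequence must repeat
after at most `m` steps, which is the periodicity mentioned above.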



3 There are exceptions, which use physical phenomena as a random source.

Testing for randomness Thus, it is important to realize that even the "random"
samples are deterministic. They are designed to optimize performance on
statistical tests. Many sophisticated statistical tests of uniform randomness
are used. One of the simplest, the chi-square test, is described here. This test
measures how far computed statistics are from their expected value. As a simple
example, suppose $\mathcal{C} = [0,1]^2$ and is partitioned into a 10 by 10
array of 100 square boxes. If a set, $P$, of $k$ samples is chosen at random,
then intuitively each box should receive roughly $k/100$ of the samples. An
error function can be defined to measure how far from true this intuition is:

$$e(P) = \sum_{i=1}^{100} \left(b_i - k/100\right)^2, \qquad (5.17)$$

in which $b_i$ is the number of samples that fall into box $i$. It is shown in
[391] that $e(P)$ follows a chi-squared distribution. A surprising fact is that
the goal is not to minimize $e(P)$. If this value is too small, we would declare
that the samples are too uniform to be random! Imagine $k = 1{,}000{,}000$ and
exactly 10,000 samples appeared in each of the 100 boxes. This yields
$e(P) = 0$, but how likely is this to ever occur? The value must generally be
larger (it appears in many statistical tables) to account for the irregularity
due to randomness.
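
The error function (5.17) can be computed directly, as in the following sketch.
The function name `box_error` and the row-major box indexing are assumptions
made for illustration; the sum follows (5.17) as written.

```python
def box_error(samples, boxes_per_axis=10):
    """Evaluate e(P) from (5.17) for 2D samples in [0, 1]^2."""
    k = len(samples)
    num_boxes = boxes_per_axis ** 2
    counts = [0] * num_boxes
    for x, y in samples:
        # Clamp so that a coordinate equal to 1.0 lands in the last box.
        col = min(int(x * boxes_per_axis), boxes_per_axis - 1)
        row = min(int(y * boxes_per_axis), boxes_per_axis - 1)
        counts[row * boxes_per_axis + col] += 1
    expected = k / num_boxes
    return sum((b - expected) ** 2 for b in counts)
```

A perfectly regular set, with one sample at the center of every box, gives an
error of zero, which is exactly the "too uniform to be random" case.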




(a) 196 pseudo-random samples (b) 196 pseudo-random samples 



Figure 5.3: Irregularity in a collection of (pseudo)random samples can be nicely 
observed with Voronoi diagrams. 

This irregularity can be observed in terms of Voronoi diagrams, as shown in
Figure 5.3. The Voronoi diagram partitions $\mathbb{R}^2$ into regions based on
the samples. Each sample, $x$, has an associated Voronoi region, $Vor(x)$. For
any point $y \in Vor(x)$, $x$ is the closest sample to $y$ using Euclidean
distance. The different sizes and shapes of these regions give some indication
of the required irregularity of random sampling. This irregularity may be
undesirable for sampling-based motion planning, and is somewhat repaired by the
deterministic sampling methods of Sections 5.2.3 and 5.2.4 (however, these
methods also have drawbacks).

(a) $L_2$ dispersion (b) $L_\infty$ dispersion

Figure 5.4: Reducing dispersion means reducing the radius of the largest empty
ball.

5.2.3 Low-Dispersion Sampling 

This section describes an alternative to random sampling. Instead, the
goal is to optimize a criterion called dispersion [579]. Intuitively, the idea is to
place samples in a way that makes the largest uncovered area be as small as
possible. This will yield a generalization of the idea of resolution. For a grid, the
resolution may be selected by defining the step size for each axis. As the step size
is decreased, the resolution increases. If a grid-based motion planning
algorithm can increase the resolution arbitrarily, it becomes resolution complete.
Using the concepts in this section, it may instead reduce its dispersion arbitrarily
to obtain a dispersion complete algorithm. This applies to multiresolution grids
or any other dense sample sequence. These concepts are explained further at the
end of Section 5.4.2.

Dispersion definition The dispersion$^4$ of a set $P$ of samples in a metric
space $(X, \rho)$ is

$$\delta(P) = \sup_{x \in X} \left\{ \min_{p \in P} \left\{ \rho(x, p) \right\} \right\}. \qquad (5.18)$$




4 The definition is unfortunately backwards from intuition. Lower dispersion means that the
points are nicely dispersed. Thus, more dispersion is bad, which is counterintuitive.

(a) 196-point Sukharev grid (b) 196 lattice points

Figure 5.5: The Sukharev grid and a lattice.

Figure 5.4 gives an interpretation of the definition for two different metrics. An
alternative way to consider dispersion is as the radius of the largest empty ball
(for the $L_\infty$ metric, the balls are actually cubes). Note that at the
boundary (if it exists), the empty ball becomes truncated because it cannot
exceed the boundary. There is also a nice interpretation in terms of Voronoi
diagrams. This is shown in Figure 5.3 for $L_2$ dispersion in $\mathbb{R}^2$.
The Voronoi vertices are the points at which three or more Voronoi regions
meet. These are points in $\mathcal{C}$ for which the nearest sample is far. An
open, empty disc can be placed at any Voronoi vertex, with a radius equal to
the distance to the three (or more) closest samples. The radius of the largest
disc among those placed at all Voronoi vertices is the dispersion. This
interpretation extends nicely to higher dimensions.
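
As a rough illustration of the definition, the supremum in the dispersion can
be approximated by probing $X = [0,1]^n$ on a dense grid and recording the
largest distance to the nearest sample. This brute-force sketch is an
assumption made here for illustration (the names `approx_dispersion` and
`linf` are hypothetical); it returns a lower bound on the true dispersion and
is practical only in low dimensions.

```python
from itertools import product

def linf(x, p):
    """L-infinity distance between two points."""
    return max(abs(a - b) for a, b in zip(x, p))

def approx_dispersion(points, probes_per_axis=61, metric=linf):
    """Approximate the dispersion of `points` in [0, 1]^n.

    The sup over X is replaced by a max over a regular grid of probe
    points, so the result never exceeds the true dispersion.
    """
    n = len(points[0])
    ticks = [i / (probes_per_axis - 1) for i in range(probes_per_axis)]
    return max(
        min(metric(x, p) for p in points)
        for x in product(ticks, repeat=n)
    )
```

For a single sample at the center of the unit square, the farthest probes are
the corners, and the $L_\infty$ value returned is 1/2.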



Making good grids Optimizing dispersion forces the points to be distributed
more uniformly over $\mathcal{C}$. This causes them to fail statistical tests,
but the point distribution is often better for motion planning purposes.
Consider the best way to reduce dispersion if $\rho$ is the $L_\infty$ metric
and $X = [0,1]^n$. Suppose that the number of samples, $k$, is given. Optimal
dispersion is obtained by partitioning $[0,1]^n$ into a grid of cubes and
placing a point at the center of each cube, as shown for $n = 2$ and $k = 196$
in Figure 5.5.a. The number of cubes per axis must be
$\lfloor k^{1/n} \rfloor$, in which $\lfloor \cdot \rfloor$ denotes the floor.
If $k^{1/n}$ is not an integer, then there will be leftover points that may be
placed anywhere without affecting the dispersion. Notice that $k^{1/n}$ just
gives the number of points per axis for a grid of $k$ points in $n$ dimensions.
The resulting grid will be referred to as a Sukharev grid [728].

The dispersion obtained by the Sukharev grid is the best possible. Therefore,
a useful lower bound can be given for any set, $P$, of $k$ samples [728]:

$$\delta(P) \ge \frac{1}{2 \lfloor k^{1/n} \rfloor}. \qquad (5.19)$$

This implies that keeping dispersion fixed requires exponentially many points in
the dimension, $n$.
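
To make the exponential growth concrete, (5.19) can be inverted for a target
dispersion $\epsilon$. The symbol $\epsilon$ and the numerical values below are
introduced here only for illustration.

```latex
% Requiring \delta(P) \le \epsilon in (5.19) forces
\frac{1}{2 \lfloor k^{1/n} \rfloor} \le \epsilon
\quad\Longrightarrow\quad
\lfloor k^{1/n} \rfloor \ge \frac{1}{2\epsilon}
\quad\Longrightarrow\quad
k \ge \left\lceil \frac{1}{2\epsilon} \right\rceil^{n}.
% Example with \epsilon = 0.05, so that \lceil 1/(2\epsilon) \rceil = 10:
%   n = 2:  k \ge 10^{2}  = 100
%   n = 6:  k \ge 10^{6}
%   n = 10: k \ge 10^{10}
```

Each additional dimension multiplies the required number of samples by the
number of points per axis.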

At this point you might wonder why $L_\infty$ was used instead of $L_2$, which
seems more natural. This is because the $L_2$ case is extremely difficult to
optimize (except in $\mathbb{R}^2$, where a tiling of equilateral triangles can
be made, with a point in the center of each one). Even the simple problem of
determining the best way to distribute a fixed number of points in $[0,1]^3$ is
unsolved for most values of $k$. See [174] for extensive treatment of this
problem.

Suppose now that other topologies are considered instead of $[0,1]^n$. Let
$X = [0,1]^n/\sim$, in which the identification produces a torus. The situation
is quite different because $X$ no longer has a boundary. The Sukharev grid still
produces optimal dispersion, but it can also be shifted without increasing the
dispersion. In this case, a standard grid may also be used, which has the same
number of points as the Sukharev grid but is translated to the origin. Thus,
the first grid point is $(0, \ldots, 0)$, which is actually the same as
$2^n - 1$ other points by identification. If $X$ represents a cylinder and the
number of points, $k$, is given, then it is best to just use the Sukharev grid.
It is possible, however, to shift each coordinate that behaves like $S^1$. If
$X$ is rectangular but not a square, a good grid can still be made by tiling
the space with cubes. In some cases this will produce optimal dispersion. For
complicated spaces such as $SO(3)$, no grid exists in the sense defined so far.
It is possible, however, to generate grids on the faces of an inscribed
Platonic solid and lift the samples to $S^n$ with relatively little distortion
[787]. For example, to sample $S^2$, Sukharev grids can be placed on each face
of a cube. These are lifted to obtain the warped grid shown in Figure 5.6.

Example 5.2.1 Suppose that $n = 2$ and $k = 9$. If $X = [0,1]^2$, then the
Sukharev grid yields points for the nine cases in which either coordinate may be
1/6, 1/2, or 5/6. The dispersion is 1/6. The spacing between the points along
each axis is 1/3, which is twice the dispersion. If instead
$X = [0,1]^2/\sim$, which represents a torus, then the nine points may be
shifted to yield the standard grid. In this case each coordinate may be 0, 1/3,
or 2/3. The dispersion and spacing between the points remain unchanged. ■
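
The two grids of Example 5.2.1 can be generated with a short sketch. The
function names are hypothetical, exact rational arithmetic is used so that the
coordinates print as 1/6, 1/2, and so on, and any leftover points (when
$k^{1/n}$ is not an integer) are simply not placed.

```python
from fractions import Fraction
from itertools import product

def points_per_axis(k, n):
    """Return floor(k^(1/n)), corrected for floating-point rounding."""
    m = int(round(k ** (1.0 / n)))
    while m ** n > k:
        m -= 1
    while (m + 1) ** n <= k:
        m += 1
    return m

def sukharev_grid(k, n):
    """Cube centers of an m^n partition of [0, 1]^n, m = floor(k^(1/n))."""
    m = points_per_axis(k, n)
    axis = [Fraction(2 * i + 1, 2 * m) for i in range(m)]
    return list(product(axis, repeat=n))

def standard_grid(k, n):
    """The same grid translated to the origin (suited to a torus)."""
    m = points_per_axis(k, n)
    axis = [Fraction(i, m) for i in range(m)]
    return list(product(axis, repeat=n))
```

Calling `sukharev_grid(9, 2)` produces the nine points with coordinates in
{1/6, 1/2, 5/6}, and `standard_grid(9, 2)` produces those with coordinates in
{0, 1/3, 2/3}.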



Figure 5.6: A distorted grid can even be placed over spheres and $SO(3)$ by
putting grids on the faces of an inscribed cube and lifting them to the surface.

Figure 5.7: A lattice can be considered as a grid in which the generators are not
necessarily orthogonal.

One nice property of grids is that they have a lattice structure. This means
that neighboring points can be obtained very easily by adding or subtracting
vectors. Let $g_j$ be an $n$-dimensional vector called a generator. A point on a
lattice can be expressed as

$$x = \sum_{j=1}^{n} k_j g_j, \qquad (5.20)$$

with integer coefficients $k_j$, for $n$ independent generators, as depicted in
Figure 5.7. In a 2D grid, the generators represent up and right. If
$X = [0, 100]^2$, and a standard grid with integer spacing is used, then the
neighbors of the point $(50, 50)$ are obtained by adding $(0, 1)$, $(0, -1)$,
$(-1, 0)$, or $(1, 0)$. In a general lattice, the generators need not be
orthogonal. An example is shown in Figure 5.5.b. In Section 5.4.2 lattice
structure will become important and convenient for defining the search graph.
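
Neighbor generation on a lattice amounts to adding and subtracting each
generator, as the following sketch shows. The function name
`lattice_neighbors` is hypothetical, and boundary handling for a bounded $X$
is left out.

```python
def lattice_neighbors(x, generators):
    """Return the 2n neighbors of lattice point x, obtained by adding
    and subtracting each of the n generators."""
    neighbors = []
    for g in generators:
        neighbors.append(tuple(a + b for a, b in zip(x, g)))
        neighbors.append(tuple(a - b for a, b in zip(x, g)))
    return neighbors

# Standard 2D grid with integer spacing: generators are "right" and "up".
around_center = lattice_neighbors((50, 50), [(1, 0), (0, 1)])
```

Nonorthogonal generators work the same way; only the list passed as
`generators` changes.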

Infinite sequences Now suppose that the number, $k$, of samples is not given.
The task is to define an infinite sequence that has the nice properties of the
van der Corput sequence, but works for any dimension. This will become the
notion of a multiresolution grid. The resolution can be iteratively doubled. For
a multiresolution standard grid in $\mathbb{R}^n$, the sequence will first place
one point at the origin. After $2^n$ points have been placed, there will be a
grid with two points per axis. After $4^n$ points, there will be four points per
axis. Thus, after $2^{in}$ points for any positive integer $i$, a grid with
$2^i$ points per axis will be represented. If we are only allowed to use
complete grids, then it becomes clear why they appear inappropriate for
high-dimensional problems. For example, if $n = 10$, then full grids appear
after $1$, $2^{10}$, $2^{20}$, $2^{30}$, etc., samples. Each doubling in
resolution multiplies the number of points by $2^n$. Thus, to use grids in high
dimensions, one must be willing to accept partial grids, and define an infinite
sequence that places points in a nice way.

The van der Corput sequence can be extended in a straightforward way as
follows. Suppose $X = T^2 = [0,1]^2/\sim$. The original van der Corput sequence
started by counting in binary. The least significant bit was used to select
which half of $[0, 1]$ was sampled. In the current setting, the two least
significant bits can be used to select the quadrant of $[0,1]^2$. The next two
bits can be used to select the quadrant within the quadrant. This procedure
continues recursively to obtain a complete grid after $k = 2^{2i}$ points, for
any positive integer $i$. For any other $k$, however, there will be only a
partial grid. The points will be distributed with optimal $L_\infty$
dispersion. This same idea can be applied in dimension $n$ by using $n$ bits at
a time from the binary sequence to select the octant. There are many other
orderings that produce $L_\infty$-optimal dispersion. Selecting orderings that
additionally optimize other criteria, such as discrepancy or $L_2$ dispersion,
is covered in [495, ?]. Unfortunately, it is more difficult to make a
multiresolution Sukharev grid. The base becomes 3 instead of 2; after every
$3^{in}$ points a complete grid will be obtained. For example, in one
dimension, the first point appears at 1/2. The next two points appear at 1/6
and 5/6. The next complete one-dimensional grid appears after there are 9
points.
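
One way to realize the bit-based construction for a multiresolution standard
grid is sketched below. The function name `multires_grid_point` and the
particular assignment of bits to coordinates are assumptions made here; as
noted above, many other orderings are possible.

```python
def multires_grid_point(index, n):
    """Return sample number `index` (starting from 0) of a multiresolution
    standard grid on [0, 1)^n.

    The binary digits of `index` are consumed n at a time, starting with
    the least significant. Each group of n bits selects one of the 2^n
    sub-cubes at the next finer resolution level.
    """
    coords = [0.0] * n
    scale = 0.5
    while index > 0:
        for j in range(n):
            if index & 1:
                coords[j] += scale
            index >>= 1
        scale *= 0.5
    return tuple(coords)

first_sixteen = [multires_grid_point(i, 2) for i in range(16)]
```

With $n = 1$ this reduces to reversing the binary digits of the index, and in
two dimensions the first 16 points form a complete grid with four points per
axis.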

Dispersion bounds Since the sample sequence is infinite, it is interesting to
consider asymptotic bounds on dispersion. It is known that for $X = [0,1]^n$
and any $L_p$ metric, the best possible asymptotic dispersion is
$O(k^{-1/n})$ for $k$ points and $n$ dimensions [579]. In this expression, $k$
is the variable in the limit and $n$ is treated as a constant. Therefore, any
function of $n$ may appear as a constant (i.e.,
$O(f(n) k^{-1/n}) = O(k^{-1/n})$ for any positive $f(n)$). An important
practical consideration is the size of $f(n)$ in the asymptotic analysis. For
example, for the van der Corput sequence from Section 5.2.1, the dispersion is
bounded by $1/k$, which means that $f(n) = 1$. This does not seem good because
for values of $k$ that are powers of two, the dispersion is $1/(2k)$. Using a
multiresolution Sukharev grid, the constant becomes 3/2 because it takes a
longer time before a full grid
is obtained. Nongrid, low-dispersion infinite sequences exist that have
$f(n) = 1/\ln 4$ [579]; these are not even uniformly distributed, which is
rather surprising.



5.2.4 Low-Discrepancy Sampling 

In some applications, selecting points that align with the coordinate axes may
be undesirable. Therefore, extensive sampling theory has been developed to
determine methods that avoid alignments while distributing the points
uniformly. In sampling-based motion planning, grids sometimes yield unexpected
behavior because a row of points may align nicely with a corridor in
$\mathcal{C}_{free}$. In some cases, a solution is obtained with surprisingly
few samples, and in others, too many samples are necessary. These alignment
problems, when they exist, generally drive the variance higher in computation
times because it is difficult to predict when they will help or hurt. This
provides motivation for developing sampling techniques that try to reduce this
sensitivity.

Discrepancy theory and its corresponding sampling methods were developed to
avoid these problems for numerical integration [579]. Let $X$ be a measure
space, such as $[0,1]^n$. Let $\mathcal{R}$ be a collection of subsets of $X$
that is called a range space. In most cases, $\mathcal{R}$ is chosen as the set
of all axis-aligned rectangular subsets; hence, this will be assumed from this
point onward. With respect to a particular point set, $P$, and range space,
$\mathcal{R}$, the discrepancy [768] for $k$ samples is defined as

$$D(P, \mathcal{R}) = \sup_{R \in \mathcal{R}} \left\{ \left| \frac{|P \cap R|}{k} - \frac{\mu(R)}{\mu(X)} \right| \right\}, \qquad (5.21)$$

in which $|P \cap R|$ denotes the number of points in $P \cap R$. Each term in
the supremum considers how well $P$ can be used to estimate the volume of $R$.
For example, if $\mu(R)$ is 1/5, then we would hope that about 1/5 of the
points in $P$ fall into $R$. The discrepancy measures the largest volume
estimation error that can be obtained over all sets in $\mathcal{R}$.
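
Computing the supremum in (5.21) exactly is laborious, but a crude lower bound
is easy to obtain by trying many random axis-aligned boxes. The sketch below
does this for $X = [0,1]^n$, for which $\mu(X) = 1$. The randomized search,
the function name `discrepancy_estimate`, and the default trial count are all
assumptions made for illustration; this is not an exact discrepancy algorithm.

```python
import random

def discrepancy_estimate(points, trials=2000, rng=None):
    """Lower-bound the discrepancy of `points` in [0, 1]^n by testing
    randomly chosen axis-aligned boxes."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so repeated calls agree
    k = len(points)
    n = len(points[0])
    worst = 0.0
    for _ in range(trials):
        # One (lo, hi) interval per axis defines the box R.
        box = [sorted((rng.random(), rng.random())) for _ in range(n)]
        volume = 1.0
        for lo, hi in box:
            volume *= hi - lo
        inside = sum(
            all(lo <= c <= hi for c, (lo, hi) in zip(p, box))
            for p in points
        )
        # Volume estimation error for this box, as in (5.21).
        worst = max(worst, abs(inside / k - volume))
    return worst
```

A badly clumped point set, such as many copies of a single point, is flagged
immediately because a small box around the clump contains all of the samples.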



Asymptotic bounds There are many different asymptotic bounds for discrepancy,
depending on the particular range space and measure space [538]. The most
widely referenced bounds are based on the standard range space of axis-aligned
rectangular boxes in $[0,1]^n$. There are two different bounds, depending on
whether or not the number of points, $k$, is given. The best possible
asymptotic discrepancy for a single sequence is $O(k^{-1} \log^n k)$. This
implies that $k$ is not specified. If, however, for every $k$ a new set of
points can be chosen, then the best possible discrepancy is
$O(k^{-1} \log^{n-1} k)$. This lower bound corresponds to the best that can be
achieved by a sequence of point sets, as opposed to a single sequence.

Figure 5.8: Discrepancy measures whether the right number of points fall into
boxes. It is related to the chi-square test, but optimizes over all possible boxes.

Relating Dispersion and Discrepancy Since balls have volume, there is a
close relationship between discrepancy, which is measure-based, and dispersion,
which is metric-based. For example, for any $P \subset [0,1]^n$,

$$\delta(P, L_\infty) \le D(P, \mathcal{R})^{1/n}, \qquad (5.22)$$

which means low discrepancy implies low dispersion. Note that the converse is
not true. An axis-aligned grid yields high discrepancy because of alignments
with the boundaries of sets in $\mathcal{R}$, but the dispersion is very low.
Even though low discrepancy implies low dispersion, lower dispersion can
usually be obtained by ignoring discrepancy (this is one less constraint to
worry about). Thus, there is a tradeoff that must be carefully considered in
applications.

Low-discrepancy sampling methods Due to the fundamental importance of 
numerical integration, and the intricate link between discrepancy and integration 
error, most of the literature has led to low-discrepancy sequences and point sets 
[579, 712, 744]. Although motion planning is quite different from integration, it 
is worth evaluating these carefully-constructed and well-analyzed samples. Their 
potential use in motion planning is no less reasonable than using pseudo-random 
sequences, which were also designed with a different intention in mind (satisfying 
statistical tests of randomness). 

Low-discrepancy sampling methods can be divided into three categories: 1)
Halton/Hammersley sampling, 2) (t,s)-sequences and (t,m,s)-nets, and 3) lattices.
The first category represents one of the earliest methods, based on extending the
van der Corput sequence. The Halton sequence is an $n$-dimensional
generalization of the van der Corput sequence, but instead of using binary
representations, a different base is used for each coordinate [311]. The result
is a reasonable deterministic replacement for random samples in many
applications. The resulting discrepancy (and dispersion) is lower than that for
random samples (with high probability). Figure 5.9.a shows the first 196 Halton
points in $\mathbb{R}^2$.

Choose $n$ relatively prime integers $p_1, p_2, \ldots, p_n$ (usually the first
$n$ primes, $p_1 = 2$, $p_2 = 3$, $\ldots$, are chosen). To construct the $i$th
sample, consider the digits of the base-$p$ representation for $i$ in the
reverse order (that is, write $i = a_0 + p a_1 + p^2 a_2 + p^3 a_3 + \cdots$,
where each $a_j \in \{0, 1, \ldots, p-1\}$) and define the following element of
$[0, 1]$:

$$r_p(i) = \frac{a_0}{p} + \frac{a_1}{p^2} + \frac{a_2}{p^3} + \frac{a_3}{p^4} + \cdots. \qquad (5.23)$$

The $i$th sample in the Halton sequence is

$$\left(r_{p_1}(i), r_{p_2}(i), \ldots, r_{p_n}(i)\right), \quad i = 0, 1, 2, \ldots. \qquad (5.24)$$

Suppose instead that $k$, the required number of points, is known. In this
case, a better distribution of samples can be obtained. The Hammersley point
set is an adaptation of the Halton sequence [312]. Using only $n - 1$ distinct
primes, the $i$th sample in a Hammersley point set with $k$ elements is

$$\left(\frac{i}{k}, r_{p_1}(i), \ldots, r_{p_{n-1}}(i)\right), \quad i = 0, 1, \ldots, k - 1. \qquad (5.25)$$

Figure 5.9.b shows the Hammersley set for $n = 2$ and $k = 196$.
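
The constructions (5.23) through (5.25) translate almost directly into code.
In the following sketch the function names `radical_inverse`, `halton`, and
`hammersley` are hypothetical, and the caller is assumed to supply the primes.

```python
def radical_inverse(i, p):
    """Compute r_p(i) from (5.23): reflect the base-p digits of i about
    the radix point."""
    result = 0.0
    scale = 1.0 / p
    while i > 0:
        i, digit = divmod(i, p)  # peel off the least significant digit
        result += digit * scale
        scale /= p
    return result

def halton(i, primes):
    """The i-th Halton sample (5.24), one prime per coordinate."""
    return tuple(radical_inverse(i, p) for p in primes)

def hammersley(k, primes):
    """Hammersley point set (5.25) with k elements; the dimension is
    len(primes) + 1 because the first coordinate is i / k."""
    return [(i / k,) + halton(i, primes) for i in range(k)]

halton_2d = [halton(i, (2, 3)) for i in range(196)]
hammersley_2d = hammersley(196, (2,))
```

With `primes=(2,)`, `halton` reproduces the one-dimensional van der Corput
sequence 0, 1/2, 1/4, 3/4, and so on.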

The construction of Halton/Hammersley samples is simple and efficient, which
has led to widespread application. They both achieve asymptotically optimal
discrepancy; however, the constant in their asymptotic analysis increases more
than exponentially with dimension [579]. The constant for the dispersion also
increases exponentially, which is much worse than for the methods of Section 5.2.3.

Improved constants are obtained for sequences and finite point sets by using
(t,s)-sequences and (t,m,s)-nets, respectively [579]. The key idea is to enforce
zero discrepancy over a particular subset of $\mathcal{R}$ known as canonical
rectangles, and all remaining ranges in $\mathcal{R}$ will contribute small
amounts to discrepancy. The most famous and widely used (t,s)-sequences are
Sobol' and Faure (see [579]). The Niederreiter-Xing (t,s)-sequence has the
best-known asymptotic constant, $(a/n)^n$, in which $a$ is a small constant
[581].

(a) 196 Halton points (b) 196 Hammersley points

Figure 5.9: The Halton and Hammersley points are easy to construct and provide
a nice alternative to random sampling that achieves more regularity. Compare
the Voronoi regions to those in Figure 5.3. Beware that although these sequences
produce asymptotically optimal discrepancy, their performance degrades
substantially in higher dimensions (e.g., beyond 10).

The third category is lattices, which can be considered as a generalization of
grids that allows nonorthogonal axes [538, 712, 765]. As an example, consider
Figure 5.5.b, which shows 196 lattice points generated by the following technique.
Let $\alpha$ be a positive irrational number. For a fixed $k$ (lattices are
closed sample sets), generate the $i$th point according to
$(i/k, \{i\alpha\})$, in which $\{\cdot\}$ denotes the fractional part of the
real value (modulo-one arithmetic). In Figure 5.5.b,
$\alpha = (\sqrt{5} + 1)/2$, the golden ratio. This procedure can be
generalized to $n$ dimensions by picking $n - 1$ distinct irrational numbers. A
technique for choosing the $\alpha$ parameters by using the roots of
irreducible polynomials is discussed in [538]. The $i$th sample in the lattice
is

$$\left(\frac{i}{k}, \{i\alpha_1\}, \ldots, \{i\alpha_{n-1}\}\right). \qquad (5.26)$$

Recent analysis shows that some lattice sets achieve asymptotic discrepancy
that is very close to that of the best-known non-lattice sample sets [323, 745].
Thus, restricting the points to lie on a lattice seems to entail little or no loss in
performance, but with the added benefit of a regular neighborhood structure that
is useful for path planning. Historically, lattices have required the specification
of $k$ in advance; however, there has been increasing interest in extensible
lattices, which are infinite sequences [324, 745].
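
The two-dimensional lattice construction with the golden ratio can be sketched
as follows. The function name `golden_lattice` is hypothetical, and floating
point arithmetic is assumed to be accurate enough for the modest values of $k$
used here.

```python
import math

def golden_lattice(k):
    """Return k lattice points (i / k, {i * alpha}) in [0, 1)^2, with
    alpha equal to the golden ratio."""
    alpha = (math.sqrt(5.0) + 1.0) / 2.0
    # x % 1.0 gives the fractional part {x} for nonnegative x.
    return [(i / k, (i * alpha) % 1.0) for i in range(k)]

lattice_points = golden_lattice(196)
```

Unlike the Halton sequence, the first coordinate depends on $k$, so the point
set must be regenerated if the number of samples changes.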

5.3 Collision Detection 

Collision detection is a critical component of sampling-based planning. Even
though it is often treated as a black box, it is important to study its inner
workings to understand the information it provides and its associated
computational cost. In most applications, the majority of computation time is
spent in collision checking, as opposed to planning.

A variety of collision detection algorithms exist, ranging from theoretical
algorithms that have excellent computational complexity to heuristic, practical
algorithms whose performance is tailored to a particular application. The
techniques from Section 4.3 can, of course, be used to develop a collision
detection algorithm by defining a logical predicate using the geometric model
of $\mathcal{C}_{obs}$. For the case of a 2D world, with a convex robot and
obstacle, this leads to a linear-time collision detection algorithm.



5.3.1 Basic Concepts 

Just as in Section 3.1.1, collision detection may be viewed as a logical
predicate. In the current setting it appears as
$\phi : \mathcal{C} \to \{\text{TRUE}, \text{FALSE}\}$, in which the domain is
$\mathcal{C}$ instead of $\mathcal{W}$. If $q \in \mathcal{C}_{obs}$, then
$\phi(q) = \text{TRUE}$; otherwise, $\phi(q) = \text{FALSE}$.



Distance between two sets For the Boolean-valued function, $\phi$, there is no
information about how far the robot is from hitting the obstacles. Such
information is very important in planning algorithms. A distance function
provides this information, and is defined as
$d : \mathcal{C} \to [0, \infty)$, in which the real value in the range of $d$
indicates the distance in the world, $\mathcal{W}$, between the closest pair of
points over all pairs from $\mathcal{A}(q)$ and $\mathcal{O}$. In general, for
two closed, bounded subsets, $E$ and $F$, of $\mathbb{R}^n$, the distance is
defined as

$$\rho(E, F) = \min_{e \in E} \left\{ \min_{f \in F} \left\{ \| e - f \| \right\} \right\}, \qquad (5.27)$$

in which $\| \cdot \|$ is the Euclidean norm. Clearly, if
$E \cap F \neq \emptyset$, then $\rho(E, F) = 0$. The methods described in this
section may be used to either compute distance or only determine whether
$q \in \mathcal{C}_{obs}$. In the latter case, the computation is often much
faster because less information is required.
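
For finite sets of points, (5.27) can be evaluated by brute force, as in the
sketch below. This is only an illustration of the definition: real bodies
contain infinitely many points and require geometric algorithms such as those
in the following sections. The function name `set_distance` is hypothetical.

```python
import math

def set_distance(E, F):
    """Evaluate (5.27) for two finite point sets: the Euclidean distance
    between the closest pair (e, f) with e in E and f in F."""
    return min(math.dist(e, f) for e in E for f in F)

# Two small 2D point sets; the closest pairs are three units apart.
rho = set_distance([(0, 0), (1, 0)], [(4, 0), (1, 3)])
```

If the two sets share a point, the minimum is attained at that point and the
result is zero, matching the remark after (5.27).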



Two-phase collision detection Suppose that the robot is a collection of $m$
attached links, $\mathcal{A}_1, \mathcal{A}_2, \ldots, \mathcal{A}_m$, and that
$\mathcal{O}$ has $k$ connected components. For this complicated situation,
collision detection can be viewed as a two-phase process.

1. In the broad phase, the task is to avoid performing expensive computations 
for bodies that are far from each other. Simple bounding boxes can be 
placed around each of the bodies, and simple tests can be performed to 
avoid costly collision checking unless the boxes overlap. Hashing schemes 
can be employed in some cases to greatly reduce the number of pairs of 
boxes that have to be tested for overlap [?]. For a robot that consists of 
multiple bodies, the pairs of bodies that should be considered for collision 
must be specified in advance, as described in Section 4.3.1. 



2. In the narrow phase, individual pairs of bodies are each checked carefully
for collision. Approaches to this phase are described in Sections 5.3.2 and
5.3.3.

Figure 5.10: Four different kinds of bounding regions: a) sphere, b) axis-aligned
bounding box (AABB), c) oriented bounding box (OBB), d) convex hull. Each
one usually provides a tighter approximation than the previous one, but is more
expensive to test for overlapping pairs.
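
The broad phase with axis-aligned bounding boxes can be sketched as follows.
Each box is represented here as a pair `(lo, hi)` of corner tuples; this
representation, the function names, and the all-pairs loop (no hashing) are
assumptions made for illustration.

```python
def aabb_overlap(a, b):
    """Return True if two axis-aligned boxes intersect. The boxes overlap
    exactly when their intervals overlap along every axis."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return all(
        al <= bh and bl <= ah
        for al, ah, bl, bh in zip(a_lo, a_hi, b_lo, b_hi)
    )

def broad_phase(link_boxes, obstacle_boxes):
    """Return index pairs (i, j) whose bounding boxes overlap. Only these
    pairs need to be passed on to the narrow phase."""
    return [
        (i, j)
        for i, lb in enumerate(link_boxes)
        for j, ob in enumerate(obstacle_boxes)
        if aabb_overlap(lb, ob)
    ]

links = [((0, 0), (1, 1)), ((5, 5), (6, 6))]
obstacles = [((0.5, 0.5), (2, 2)), ((10, 10), (11, 11))]
candidates = broad_phase(links, obstacles)
```

In this small example only the first link and the first obstacle survive the
broad phase, so three of the four expensive pairwise checks are avoided.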

5.3.2 Hierarchical Methods 

In this section, suppose that two complicated, nonconvex bodies, $E$ and $F$,
are to be checked for collision. Each body could be part of either the robot or
the obstacle region. They are subsets of $\mathbb{R}^2$ or $\mathbb{R}^3$,
defined using any kind of geometric primitives, such as triangles in
$\mathbb{R}^3$. Hierarchical methods generally represent each body as a tree in
which each node represents a bounding region that contains all of the points in
one portion of the body. The bounding region of the root node contains all of
the points in the body.

There are generally two opposing criteria that guide the selection of the type
of bounding region:

1. The region should fit the actual data as tightly as possible. 

2. The intersection test for two regions should be as efficient as possible. 

Several popular choices are shown in Figure 5.10 for an L-shaped body. 

The tree is constructed for a body, E (or alternatively, F) recursively as follows. 
For each node, consider the set, X, of all points in E that are contained in 
the bounding region. Two child nodes are constructed by defining two smaller 
bounding regions whose union covers X. The split is made so that the portion 
covered by each child is of similar size. If the geometric model consists of primitives 
such as triangles, then a split could be made to separate the triangles into two sets
of roughly the same number of triangles. A bounding region is then computed 
for each of the children. Figure 5.11 shows an example of a split for the case of 
an L-shaped body. Children are generated recursively by making splits until very 
simple sets are obtained. For example, in the case of triangles in space, a split is 
made unless the node represents a single triangle. In this case, it is easy to test 
for intersection of two triangles. 

Consider the problem of determining whether bodies E and F are in collision. 
Suppose that trees, $T_e$ and $T_f$, have been constructed for E and F, 
respectively. If the bounding regions of the root nodes of $T_e$ and $T_f$ do 
not intersect, then it is known that E and F are not in collision without 
performing any additional computation. If the bounding regions do intersect, 
then the bounding regions of the children of $T_e$ are compared to the bounding 
region of $T_f$. If either of these intersect, then the bounding region of $T_f$ 
is replaced with the bounding regions of its children, and the process continues 
recursively. As long as the bounding regions overlap, lower levels of the trees 
will be traversed, until eventually the leaves are reached. If triangle 
primitives are used for the geometric models, then at the leaves, the algorithm 
will test the individual triangles for collision, instead of bounding regions. 
Note that as the trees are traversed, if a bounding region from a node, $n_1$, 
of $T_e$ does not intersect the bounding region from a node, $n_2$, of $T_f$, 
then no children of $n_1$ have to be compared to children of $n_2$. This can 
generally result in a dramatic reduction in the number of comparisons, relative 
to a naive approach that, for example, tests all pairs of triangles for 
intersection. 

Figure 5.11: The large circle shows the bounding region for a node that covers an 
L-shaped body. After performing a split along the dashed line, two smaller circles 
are used to cover the two halves of the body. Each circle corresponds to a child 
node. 
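
The recursive traversal can be sketched as follows. This is a minimal 
illustration, not the implementation from the cited work: it assumes that every 
bounding region is a circle and that the leaf primitives are themselves discs 
(so the leaf test is exact), and the names Node, circles_overlap, and 
in_collision are hypothetical. The descent order is also simplified: whichever 
node still has children (preferring the larger one) is expanded. 

```python
from dataclasses import dataclass, field
from math import hypot


@dataclass
class Node:
    center: tuple                                   # (x, y) of the bounding circle
    radius: float
    children: list = field(default_factory=list)   # empty list for a leaf


def circles_overlap(a, b):
    """Bounding-region test: two circles intersect if centers are close enough."""
    dx, dy = a.center[0] - b.center[0], a.center[1] - b.center[1]
    return hypot(dx, dy) <= a.radius + b.radius


def in_collision(n1, n2):
    """Hierarchical test between the subtrees rooted at n1 and n2."""
    if not circles_overlap(n1, n2):
        return False          # prune: no descendants need to be compared
    if not n1.children and not n2.children:
        return True           # two leaves (assumed discs): overlap is exact
    # Expand the node that still has children, preferring the larger region.
    if n1.children and (not n2.children or n1.radius >= n2.radius):
        return any(in_collision(c, n2) for c in n1.children)
    return any(in_collision(n1, c) for c in n2.children)
```

With triangle primitives, only the leaf case would change: the overlap test at 
the leaves would be replaced by a triangle-triangle intersection test. 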

It is possible to extend the hierarchical collision detection scheme to also 
compute distance. If at any time a pair of bounding regions has a distance 
greater than the smallest distance computed so far, then their children do not 
have to be considered [493]. 

5.3.3 Incremental Methods 

This section focuses on a particular approach called incremental distance 
computation, which assumes that between successive calls to the collision 
detection algorithm, the bodies move only a small amount. Under this 
assumption, the algorithm achieves "almost constant time" performance for the 
case of convex polyhedral bodies [492, 556]. Nonconvex bodies can be decomposed 
into convex components. 

These collision detection algorithms seem to offer wonderful performance, but 
this comes at a price. The models must be coherent, which means that all of 
the primitives must fit together nicely. For example, if a 2D model uses line 
segments, all of the line segments must fit together perfectly to form polygons. 
There can be no isolated segments or chains of segments. In 3D, polyhedral models 
are required to have all faces come together perfectly to form the boundaries of 
three-dimensional shapes. The model cannot be an arbitrary collection of 3D 
triangles. 

Figure 5.12: The Voronoi regions alternate between being edge-based and 
vertex-based. The Voronoi regions of vertices are labeled with a "V", and the 
Voronoi regions of edges are labeled with an "E". Note that the Voronoi regions 
alternate between "V" and "E" (no two Voronoi regions of the same kind are 
adjacent). The adjacencies between these Voronoi regions follow the same pattern 
as the adjacencies between vertices and edges in the polygon (a vertex is always 
between two edges, etc.). 

The method will be explained for the case of 2D convex polygons. Voronoi 
regions will be defined for a convex polygon, in terms of features. The feature set 
is the set of all vertices and edges of a convex polygon. Thus, a polygon with 
n edges has 2n features. Any point outside of the polygon has a closest feature 
in terms of Euclidean distance. For a given feature, g, the set of all points from 
which g is the closest feature is the Voronoi region of g, denoted Vor(g). Figure 
5.12 shows all ten Voronoi regions for a pentagon. 
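
As an illustration of these definitions, the following sketch tests which 
Voronoi region contains a point outside of a convex polygon. It is only a 
sketch under stated assumptions: the vertices are assumed to be listed in 
counterclockwise order, edge i is taken to run from vertex i to vertex i+1, and 
the function names are hypothetical. 

```python
def sub(u, v):
    return (u[0] - v[0], u[1] - v[1])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def in_vertex_region(poly, i, p):
    """True if p lies in Vor(v_i): p is 'behind' both edges that meet at v_i."""
    n = len(poly)
    v, nxt, prv = poly[i], poly[(i + 1) % n], poly[i - 1]
    w = sub(p, v)
    return dot(w, sub(nxt, v)) <= 0 and dot(w, sub(prv, v)) <= 0


def in_edge_region(poly, i, p):
    """True if p lies in Vor(e_i): outside the edge and projecting onto it."""
    a, b = poly[i], poly[(i + 1) % len(poly)]
    d, w = sub(b, a), sub(p, a)
    t = dot(w, d) / dot(d, d)                  # projection parameter along the edge
    outside = d[0] * w[1] - d[1] * w[0] < 0    # right side of a CCW edge
    return outside and 0 <= t <= 1


def closest_feature(poly, p):
    """Return ('E', i) or ('V', i) for a point outside poly, else None."""
    for i in range(len(poly)):
        if in_edge_region(poly, i, p):
            return ("E", i)
    for i in range(len(poly)):
        if in_vertex_region(poly, i, p):
            return ("V", i)
    return None   # p is inside the polygon
```

For a polygon with n edges, the loops visit n edge regions and n vertex 
regions, matching the 2n features counted above. 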

For any two convex polygons that do not intersect, the distance between them is 
determined by a closest pair of points, one on each polygon (usually the points 
are unique, except in the case of parallel edges). Consider the feature for each 
point in this pair. There are only three possible combinations: 

• Edge-Edge: Each point of the closest pair lies on an edge. In this case, 
the edges must be parallel. 

• Edge-Vertex: One point of the closest pair lies on an edge, and the other 
lies on a vertex. 

• Vertex-Vertex: Each point of the closest pair is a vertex of a polygon. 

Let $g_e$ and $g_f$ represent any feature pair of E and F, respectively. Let 
$(x_e, y_e) \in g_e$ and $(x_f, y_f) \in g_f$ denote the closest pair of points, 
among all pairs of points in $g_e$ and $g_f$, respectively. The following 
condition can be used to determine whether the distance between $(x_e, y_e)$ and 
$(x_f, y_f)$ is the distance between E and F: 

$(x_f, y_f) \in \mathrm{Vor}(g_e)$ and $(x_e, y_e) \in \mathrm{Vor}(g_f)$.    (5.28) 

If this condition is satisfied for a given feature pair, then the distance between 
E and F is equal to the distance between $g_e$ and $g_f$. This implies that, once 
such a feature pair is known, the distance between E and F can be determined in 
constant time. The assumption that E moves only a small amount is made to 
increase the likelihood that the closest feature pair will remain the same. This 
is why the phrase "almost constant time" is used to describe the performance of 
the algorithm. Of course, it is possible that the closest feature pair will 
change. In this case, neighboring features can be tested using the condition 
above, until the new closest pair of features is found. In the worst case, this 
search could be costly, but this violates the assumption that the bodies do not 
move far between successive calls. 

The same ideas can be applied for the 3D case in which the bodies are convex 
polyhedra [492, 556]. The primary difference is that three kinds of features are 
considered: faces, edges, and vertices. The cases become more complicated, but 
the idea is the same. Once again, the condition regarding mutual Voronoi regions 
holds, and the algorithm has nearly constant time performance. 

5.3.4 Checking a Path Segment 

Collision detection algorithms determine whether a configuration lies in 
$C_{free}$, but motion planning algorithms require that an entire path maps into 
$C_{free}$. The interface between the planner and collision detection usually 
involves validation of a path segment (i.e., a path, but often a short one). This 
cannot be checked point-by-point because it would require an uncountably infinite 
number of calls to the collision detection algorithm. 

Suppose that a path, $\tau : [0,1] \to C$, needs to be checked to determine 
whether $\tau([0,1]) \subset C_{free}$. A common approach is to sample the 
interval [0, 1], and call the collision checker only on the samples. What 
resolution of sampling is required? How can one ever guarantee that the places 
where the path is not sampled are collision free? There are both practical and 
theoretical answers to these questions. In practice, a fixed $\Delta q$ is chosen 
as the configuration space step size. Points $t_1, t_2 \in [0,1]$ are then chosen 
close enough together to ensure that $\rho(\tau(t_1), \tau(t_2)) \le \Delta q$, 
in which $\rho$ is the metric on C. The value of $\Delta q$ is often determined 
experimentally. If $\Delta q$ is too small, then considerable time is wasted on 
collision checking. If $\Delta q$ is too large, then there is a chance that the 
robot could jump through a thin obstacle. 
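
The fixed-step approach can be sketched as follows for the simplest case. The 
sketch assumes a straight-line path segment between two configurations in a 
Euclidean C-space, with the Euclidean metric playing the role of the metric on 
C; the function name segment_is_free and the callback is_free (a stand-in for 
the collision checker) are hypothetical. 

```python
from math import ceil, dist


def segment_is_free(q0, q1, is_free, dq):
    """Check samples along the straight segment from q0 to q1.

    Consecutive samples are at most dq apart. Passing this test does NOT
    prove that the whole segment is collision free.
    """
    steps = max(1, ceil(dist(q0, q1) / dq))
    for k in range(steps + 1):
        t = k / steps
        q = tuple(a + t * (b - a) for a, b in zip(q0, q1))
        if not is_free(q):
            return False
    return True
```

With a coarse step, a thin wall between two samples goes unnoticed; with a fine 
step, the same wall is detected at the cost of many more collision checks. 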

Setting $\Delta q$ empirically might not seem satisfying. Fortunately, there are sound 
algorithmic ways to verify that a path is collision free. In some cases the methods 
are still not used because they are trickier to implement and they often yield 
worse performance. Therefore, both methods are presented here, and you can 
decide which is appropriate, depending on the context and your personal tastes. 

Ensuring that $\tau([0,1]) \subset C_{free}$ requires the use of both Hausdorff 
distance information and bounds on the distance that points on A can travel in R. 
Such bounds can be obtained by using the robot displacement metric from Example 
5.1.6. Before expressing the general case, the concept will first be explained 
in terms of a rigid robot that translates and rotates in $W = \mathbb{R}^2$. Let 
$(x_t, y_t) \in \mathbb{R}^2$ and $\theta \in [0, 2\pi]/\sim$. Suppose that a 
collision detection algorithm indicates that A(q) is at least d units away from 
collision with obstacles in W. This information can be used to determine a region 
in $C_{free}$ that contains q. Suppose that the next candidate configuration to 
be checked along $\tau$ is q'. If no point on A travels more than distance d when 
moving from q to q' along $\tau$, then q' and all configurations between q and q' 
must be collision free. This assumes that the path from q to q' is monotonic (if 
the robot can take any path between q and q', then no such guarantee could 
possibly be made). 



Figure 5.13: The furthest point on A from the origin travels the fastest when 
rotated. At most it can be displaced by $2\pi r$, if $x_t$ and $y_t$ are fixed. 

When A undergoes a translation, all points move the same distance. For 
rotation, however, the distance traveled depends on how far the point on A is 
from the rotation center, (0,0). Let $a_r = (x_r, y_r)$ denote the point on A 
that has the largest magnitude, $r = \sqrt{x_r^2 + y_r^2}$. Figure 5.13 shows an 
example. A transformed point, $a \in A$, may be denoted by $a(x_t, y_t, \theta)$. 
The following bound is obtained for any $a \in A$, if the robot is rotated from 
orientation $\theta$ to $\theta'$: 

$\|a(x_t,y_t,\theta) - a(x_t,y_t,\theta')\| \le \|a_r(x_t,y_t,\theta) - a_r(x_t,y_t,\theta')\| \le r|\theta - \theta'|$,    (5.29) 

assuming that a path in C is followed that interpolates between $\theta$ and 
$\theta'$ (using the shortest path in $S^1$ between $\theta$ and $\theta'$). 
Thus, if A(q) is at least d away from the obstacles, then the orientation may be 
changed as long as $r|\theta - \theta'| < d$. Note that this is a loose upper 
bound because $a_r$ travels along a circular arc, and can be displaced by no more 
than $2\pi r$. 

Similarly, $x_t$ and $y_t$ may individually vary up to d, yielding 
$|x_t - x'_t| < d$ and $|y_t - y'_t| < d$. If all three parameters vary at the 
same time, then a region in $C_{free}$ may be defined as 

$\{(x'_t, y'_t, \theta') \in C \mid |x_t - x'_t| + |y_t - y'_t| + r|\theta - \theta'| < d\}$.    (5.30) 

Such bounds can generally be used to set the step size, $\Delta q$, for collision 
checking that guarantees the intermediate points lie in $C_{free}$. The particular 
value used may vary depending on d and the direction$^5$ of the path. 
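
A direct test of membership in the region (5.30) can be sketched as follows. 
The function names are hypothetical, the angular difference is measured along 
the shortest path in $S^1$ as assumed for (5.29), and d is taken to be the 
clearance reported by the collision detection algorithm at q. 

```python
from math import pi


def angle_difference(theta, theta_new):
    """Shortest distance between two angles on the circle, in [0, pi]."""
    diff = abs(theta_new - theta) % (2.0 * pi)
    return min(diff, 2.0 * pi - diff)


def motion_is_certified(q, q_new, r, d):
    """True if q_new = (x', y', theta') lies in the region (5.30) around q.

    r is the largest distance from the rotation center to a point on A,
    and d is the clearance of A(q) from the obstacles.
    """
    (x, y, theta), (x_new, y_new, theta_new) = q, q_new
    bound = (abs(x - x_new) + abs(y - y_new)
             + r * angle_difference(theta, theta_new))
    return bound < d
```

If the test succeeds, no further collision checks are needed between q and 
q_new; if it fails, a smaller step is taken or the collision checker is called 
again at an intermediate configuration. 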

For the case of SO(3), once again the displacement of the point on A that 
has the largest magnitude can be bounded. It is best in this case to express the 
bounds in terms of quaternion differences, $\|h - h'\|$. Euler angles may also be 
used to obtain a straightforward generalization of (5.30) that has six terms, 
three for translation and three for rotation. For each of the three rotation 
parts, a point with largest magnitude in the plane perpendicular to the rotation 
axis must be chosen. 

If there are multiple links, it becomes much more complicated to determine the 
step size. Each point $a \in A_i$ is transformed by some nonlinear function based 
on the kinematic expressions from Sections 3.3 and 3.4. Let $a : C \to W$ denote 
this transformation. In some cases, it might be possible to derive a Lipschitz 
bound of the form 

$\|a(q) - a(q')\| < c\|q - q'\|$,    (5.31) 

in which $c \in (0, \infty)$ is a fixed constant, a is any point on $A_i$, and 
the expression holds for any $q, q' \in C$. The goal is to make c as small as 
possible to enable larger variations in q. 

A better method is to individually bound the link displacement with respect 
to each parameter, 

$\|a(q_1, \ldots, q_{i-1}, q_i, q_{i+1}, \ldots, q_n) - a(q_1, \ldots, q_{i-1}, q'_i, q_{i+1}, \ldots, q_n)\| < c_i|q_i - q'_i|$,    (5.32) 

to obtain the Lipschitz constants $c_1, \ldots, c_n$. The bound on robot 
displacement becomes 

$\|a(q) - a(q')\| < \sum_{i=1}^{n} c_i|q_i - q'_i|$.    (5.33) 

5 To formally talk about directions, it would be better to define a differentiable 
structure on C. This will be deferred to Section ??, where it seems unavoidable. 

The benefit of using individual parameter bounds can be seen by considering a long 
chain. Consider a 50-link chain of line segments in $\mathbb{R}^2$, in which each 
link has length 10. The configuration space is $T^{50}$, which can be 
parameterized as $[0, 2\pi]^{50}/\sim$. Suppose that the chain is in a 
straight-line configuration ($\theta_i = 0$ for all $1 \le i \le n$), which means 
that the last point is at (500, 0). Changes in $\theta_1$, the orientation of the 
first link, will dramatically move $A_{50}$. However, changes in $\theta_{50}$ 
will move $A_{50}$ a smaller amount. Therefore, it is advantageous to pick a 
different $\Delta q_i$ for each $1 \le i \le n$. In this example, a smaller value 
should be used for $\Delta\theta_1$ in comparison to $\Delta\theta_{50}$. 
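
To make the chain example concrete, the sketch below computes one valid set of 
per-joint constants and the resulting step sizes. Several things here are 
assumptions rather than part of the text: each angle is treated as a joint 
angle relative to the previous link, so that rotating joint i can move a point 
on the chain by at most the total length of links i through n per radian; the 
displacement budget d is split equally among the joints; and the function 
names are hypothetical. 

```python
def chain_lipschitz_constants(n, link_length):
    """c_i for joint i = 1..n: the length of the chain from joint i to the tip."""
    return [(n - i + 1) * link_length for i in range(1, n + 1)]


def per_joint_steps(constants, d):
    """Step sizes whose weighted sum, as in (5.33), equals the budget d.

    Each joint receives an equal share d / n of the displacement budget
    (an arbitrary choice made only for this illustration).
    """
    n = len(constants)
    return [d / (n * c) for c in constants]


constants = chain_lipschitz_constants(50, 10.0)
steps = per_joint_steps(constants, 1.0)
# The first joint gets the smallest step and the last joint the largest.
```

Under these assumptions the constant for the first joint is 500 and the constant 
for the last joint is 10, so the allowed step for the last joint is 50 times 
larger than for the first. 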

Unfortunately, there are more complications. Suppose the 50-link chain is in 
a configuration that folds all of the links on top of each other ($\theta_i = \pi$ 
for each $1 \le i \le n$). In this case, $A_{50}$ does not move as fast when 
$\theta_1$ is perturbed, in comparison to the straight-line configuration. A 
larger step size for $\theta_1$ could be used for this configuration, relative to 
other parts of C. The implication is that although Lipschitz constants can be 
made to hold over all of C, it still might be preferable to determine in a local 
region around $q \in C$ how much link displacement is possible with respect to 
each parameter perturbation. A linear method could be obtained by analyzing the 
Jacobian of the transformations, such as (3.45) and (3.49). 

Another important concern when checking a path is the order in which the 
samples are checked. For simplicity, suppose that $\Delta q$ is constant and that 
the path is a constant-speed parameterization. Should the collision checker step 
along from 0 up to 1? Experimental evidence indicates that it is best to use a 
recursive binary strategy [272]. This will make no difference if the path is 
collision-free, but it often saves time when the path is in collision. This is a 
kind of sampling problem over [0, 1], which is addressed nicely by the van der 
Corput sequence, $\nu$. The last column in Figure 5.2 indicates precisely where 
to check along the path in each step. Initially, $\tau(1)$ is checked. Following 
this, points from the van der Corput sequence are checked in order: $\tau(0)$, 
$\tau(1/2)$, $\tau(1/4)$, $\tau(3/4)$, $\tau(1/8)$, .... The process terminates 
if a collision is found, or when the dispersion falls below $\Delta q$. If 
$\Delta q$ is not constant, then it is possible to skip over some points of $\nu$ 
in regions where the allowable variation is larger. 
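
The checking order can be sketched as follows. This is an illustration under 
stated assumptions: the path is given as a function tau on [0, 1] with a 
constant-speed parameterization and known length, the stopping rule is 
simplified to "stop once neighboring samples are at most dq apart," and the 
names van_der_corput, path_is_free, and is_free are hypothetical. 

```python
def van_der_corput(i):
    """i-th van der Corput point in base 2: the bits of i mirrored about the
    binary point (0, 1/2, 1/4, 3/4, 1/8, ...)."""
    value, denom = 0.0, 1.0
    while i > 0:
        denom *= 2.0
        value += (i % 2) / denom
        i //= 2
    return value


def path_is_free(tau, is_free, length, dq):
    """Check tau(1), then tau at van der Corput points, stopping at the
    first collision."""
    n = 1
    while length / n > dq:      # smallest power of two giving spacing <= dq
        n *= 2
    if not is_free(tau(1.0)):
        return False
    for i in range(n):
        if not is_free(tau(van_der_corput(i))):
            return False
    return True
```

For a path that collides near its midpoint, this order reports the collision 
after three checks, whereas stepping from 0 toward 1 would need a number of 
checks proportional to half the path length divided by dq. 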

5.4 Incremental Sampling and Searching 
5.4.1 The General Framework 

The algorithms of Sections 5.4 and 5.5 follow the single-query model, which means 
that $q_I$ and $q_G$ are given only once per robot and obstacle set. This means 
that there are no advantages to precomputation, and the sampling-based motion 
planning problem can be considered as a kind of search. In fact, these 
sampling-based planning algorithms are strikingly similar to the family of search 
algorithms summarized in Section 2.3.4. The main difference lies in Step 3 below, 
in which applying an action, u, is replaced by generating a path segment, 
$\tau_s$. Another difference is that G is an undirected graph whose edges 
represent paths, as opposed to a directed graph whose edges represent actions. It 
is possible to make these look similar by defining an action space for motion 
planning that consists of a collection of paths, but this is avoided here. In the 
case of motion planning with differential constraints, this will actually be 
required; see Chapter 15. 

Most single-query sampling-based planning algorithms follow this template: 

1. Initialization: Let G(V, E) represent an undirected search graph, for which 
the node set, V, contains a node for $q_I$ and possibly other states in 
$C_{free}$, and the edge set, E, is empty. 

2. Vertex Selection Method (VSM): Choose a vertex $q_{cur} \in V$ for 
expansion. 

3. Local Planning Method (LPM): For some $q_{new} \in C_{free}$, which may or 
may not be represented by a vertex in V, attempt to construct a path 
$\tau_s : [0,1] \to C_{free}$ such that $\tau_s(0) = q_{cur}$ and 
$\tau_s(1) = q_{new}$. Using the methods of Section 5.3.4, $\tau_s$ must be 
checked to ensure that it does not cause a collision. If this step fails to 
produce a collision-free path segment, then go to Step 2. 

4. Insert Edge in Graph: Insert $\tau_s$ into E, as an edge from $q_{cur}$ to 
$q_{new}$. If $q_{new}$ is not already in V, it is added. 

5. Check for Solution: Determine whether G encodes a solution path. As in 
the discrete case, if there is a single search tree, then this is trivial; otherwise, 
it can become expensive. 

6. Return to Step 2: Iterate unless a solution has been found or some 
termination condition is satisfied, in which case the algorithm reports failure. 
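
The six steps can be sketched as a single loop. This is a minimal single-tree 
skeleton, not a complete planner: the graph is stored as parent links, the VSM 
and LPM are passed in as hypothetical callbacks (select_vertex and local_plan), 
the LPM is assumed to perform its own collision checking and to return None on 
failure, and the termination condition is just an iteration limit. The toy VSM 
and LPM in greedy_demo are illustrative choices for an obstacle-free plane. 

```python
from math import dist


def incremental_search(q_init, q_goal, select_vertex, local_plan, max_iters=1000):
    parent = {q_init: None}               # Step 1: V = {q_init}, E is empty
    for _ in range(max_iters):            # Step 6: iterate until termination
        q_cur = select_vertex(parent)     # Step 2: VSM
        q_new = local_plan(q_cur)         # Step 3: LPM (None means failure)
        if q_new is None or q_new in parent:
            continue
        parent[q_new] = q_cur             # Step 4: insert edge (q_cur, q_new)
        if q_new == q_goal:               # Step 5: trivial for a single tree
            path = []
            while q_new is not None:
                path.append(q_new)
                q_new = parent[q_new]
            return path[::-1]
    return None                           # report failure


def greedy_demo():
    """Toy run: the VSM picks the vertex nearest the goal; the LPM steps
    straight toward the goal by at most 0.25 (no obstacles)."""
    q_init, q_goal, step = (0.0, 0.0), (1.0, 0.0), 0.25

    def select_vertex(parent):
        return min(parent, key=lambda q: dist(q, q_goal))

    def local_plan(q):
        d = dist(q, q_goal)
        if d <= step:
            return q_goal
        return tuple(a + step * (b - a) / d for a, b in zip(q, q_goal))

    return incremental_search(q_init, q_goal, select_vertex, local_plan)
```

Different planners in this family are obtained by swapping the two callbacks 
while leaving the loop unchanged. 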

In the present context, G is a topological graph, as defined in Example 4.1.6. 
Each vertex is a configuration and each edge is a path that connects two config- 
urations. In this chapter, it will be simply referred to as a graph when there is 
no chance of confusion. Some authors might refer to such a graph as a roadmap; 
however, we reserve the term roadmap for a graph that contains enough paths to 
make any motion planning query easily solvable. This case is covered in Section 
5.6 and throughout Chapter 6. 

A large family of sampling-based algorithms can be described by varying the 
implementations of Steps 2 and 3. Implementations of the other steps may also 
vary, but this is less important and will be described where appropriate. For 
convenience, Step 2 will be called the Vertex Selection Method (VSM) and Step 3 
will be called the Local Planning Method (LPM). The role of the VSM is similar to 
that of the priority queue, Q, in Section 2.3.1. The role of the LPM is to compute 
a collision-free path segment that can be added to the graph. It is called local 
because the path segment is usually simple (e.g., the shortest path) and travels a 
short distance. It is not global in the sense that the LPM does not try to solve the 
entire planning problem; it is expected that the LPM may often fail to construct 
path segments. 

Figure 5.14: Imagine a problem in which the configuration space obstacle is a giant 
"bowl" that can trap the configuration. This figure is drawn in two dimensions, 
but imagine that C has many dimensions, such as 6 for SE(3) or perhaps 
dozens for a linkage. If the discrete planning algorithms from Section 2.3 are 
applied to a high-resolution grid approximation of C, then they will all waste their 
time filling up the bowl before being able to escape to $q_G$. The number of grid 
states in this bowl would typically be on the order of $100^n$, for an 
n-dimensional configuration space. 

It will be formalized shortly, but imagine for the time being that any of the 
search algorithms from Section 2.3 may be applied to motion planning by 
approximating C with a high-resolution grid. The resulting problem looks like a 
multidimensional extension of Example 2.2.1 (the "labyrinth" is formed by 
$C_{obs}$). For a high-resolution grid in a high-dimensional space, most classical 
discrete searching algorithms have trouble with becoming trapped in a local 
minimum. There could be an astronomical number of states that fall within a 
concavity in $C_{obs}$ that must be escaped to solve the problem, as shown in 
Figure 5.14. Therefore, sampling-based motion planning algorithms combine 
sampling and searching in a way that attempts to overcome these kinds of 
difficulties. 

Just as in the case of discrete search algorithms, there are several classes of 
algorithms based on the number of search trees. 

Unidirectional (single tree) methods: In this case, the planning appears 
very similar to discrete forward search, which was given in Figure 2.5. The 
main difference between algorithms in this category is how they implement 
the VSM and LPM. Figure 5.15 shows a bug trap$^6$ example for which forward 
search algorithms will have great trouble; however, the problem might not 
be difficult for backwards search, if the planner incorporates some kind of 
greedy, best-first behavior. 

Bidirectional (two tree) methods: Since it is not known whether or not 
$q_I$ or $q_G$ might lie in a bug trap (or another challenging region), a 
bidirectional approach is often preferable. This follows from an intuition that 
two propagating wavefronts, one centered on $q_I$ and the other on $q_G$, will 
meet after covering less area in comparison to a single wavefront centered at 
$q_I$ that must arrive at $q_G$. A bidirectional search is achieved by defining 
the VSM to alternate between trees when selecting nodes. The LPM sometimes 
generates paths that explore new parts of $C_{free}$, and at other times it tries 
to generate a path that connects the two trees. 

Multidirectional (more than two trees) methods: If the problem is 
so bad that a double bug trap exists, as shown in Figure 5.16, then it might 
make sense to grow trees from other places in the hopes that there are 
better chances to enter the traps in the other direction. This complicates 
the problem of connecting trees, however. For which pairs should attempts 
be made to connect? How often should these attempts be made? Which 
vertex pairs should be selected? Many heuristic parameters may be needed 
to answer these questions. 

Of course, we can play the devil's advocate and construct the example in Figure 
5.17, for which virtually all sampling-based planning algorithms are doomed. 
Several variations can also be made. For example, the connecting pipe could have 
a small hole in it; this does not help. The two bug traps could even be 
disconnected, as long as the entrance to each is hard to find. 

5.4.2 Adapting Classical Search Algorithms 

One of the most convenient and straightforward ways to make sampling-based 
planning algorithms is to define a grid over C and conduct a discrete search using 
the algorithms of Chapter 2. The resulting planning problem actually looks very 
similar to Example 2.2.1. Each edge now corresponds to a path in $C_{free}$. Some 
edges may not exist because of collisions, but this will have to be revealed during 
the search because an explicit characterization of $C_{obs}$ is too expensive to 
construct (recall Section 4.3). 

6 This principle is actually used in real life to trap flying bugs. This example and 
analogy was suggested by James O'Brien in a discussion with James Kuffner. 

Figure 5.15: This example, again in high dimensions, can be considered as a kind 
of "bug trap". To leave the trap, a path must be found from $q_I$ into the narrow 
opening. Imagine a fly buzzing around through the high-dimensional trap. The 
escape opening might not look too difficult in two dimensions, but if it has a small 
range with respect to each configuration parameter, it will be nearly impossible to 
find the opening. The tip of the volcano would be astronomically small compared 
to the rest of the bug trap. Examples such as this provide some motivation for 
bidirectional algorithms. It might be easier for a search tree that starts in $q_G$ 
to arrive in the bug trap. 

Figure 5.16: The double bug trap is trouble even for bidirectional search. This 
may motivate the construction of more than two trees. 

Figure 5.17: A multidirectional search cannot even help with this example, which 
involves two bug traps connected by a thin tube. We must accept the fact that 
some problems are hopeless to solve using sampling-based planning methods, 
unless there is some problem-specific structure that can be additionally exploited. 

Assume that an n-dimensional configuration space is represented as a unit 
cube, $C = [0,1]^n/\sim$, in which $\sim$ indicates that identifications of the 
sides of the cube are made to reflect the C-space topology. Representing C as a 
unit cube usually requires a reparameterization. For example, an angle 
$\theta \in [0, 2\pi)$ would be replaced with $\theta/2\pi$ to make the range lie 
within [0,1]. If quaternions are used for SO(3), then the upper half of $S^3$ 
will have to be deformed into $[0,1]^3/\sim$. 

Discretization: Assume that C is discretized by using the resolutions $k_1$, 
$k_2$, ..., and $k_n$, in which each $k_i$ is a positive integer. This allows the 
resolution to be different for each C-space coordinate. Either a standard grid or 
a Sukharev grid can be used. Let $\Delta q_i = [0 \cdots 0 \;\; 1/k_i \;\; 0 \cdots 0]$, 
in which $1/k_i$ is the ith component. A grid point is a configuration $q \in C$ 
that can be expressed in the form$^7$ 

$q = \sum_{i=1}^{n} j_i \Delta q_i$,    (5.34) 

in which each $j_i \in \{0, 1, \ldots, k_i\}$. The integers $j_1, \ldots, j_n$ can 
be imagined as array indices for the grid. Let the term boundary grid point refer 
to a grid point that has $j_i = 0$ or $j_i = k_i$ for some i. Note that due to 
identification, boundary grid points might have more than one representation. 

Neighborhoods: For each grid point, q, we need to define the set of nearby 
grid points for which an edge may be constructed. Special care must be given to 
defining the neighborhood of a boundary grid point to ensure that identifications 
and the C-space boundary (if it exists) are respected. If q is not a boundary grid 
point, then the 1-neighborhood is defined as 

$N_1(q) = \{q + \Delta q_1, \ldots, q + \Delta q_n, q - \Delta q_1, \ldots, q - \Delta q_n\}$.    (5.35) 

For an n-dimensional configuration space there are at most 2n 1-neighbors. In two 
dimensions, this yields 4 1-neighbors, which can be thought of as "up", "down", 
"left", and "right". We say "at most" because some directions may be blocked by 
the obstacle region. 

A 2-neighborhood is defined as 

$N_2(q) = \{q \pm \Delta q_i \pm \Delta q_j \mid 1 \le i, j \le n, \; i \ne j\} \cup N_1(q)$.    (5.36) 

Similarly, a k-neighborhood can be defined for any positive integer $k \le n$. For 
an n-neighborhood, there are at most $3^n - 1$ neighbors; there may be fewer due 
to collisions. The definitions can be extended in a straightforward way to handle 
the boundary points. 

7 Alternatively, the general lattice definition in (5.20) could be used. 

Figure 5.18: A topological graph can be constructed during the search, and can 
successfully solve a problem using very few samples. 

Obtaining a discrete planning problem: Once the grid and neighborhoods 
have been defined, it is straightforward to define a discrete planning problem. 
Figure 5.18 depicts the process for a problem in which there are 9 Sukharev grid 
points in $[0,1]^2$. Using 1-neighborhoods, the potential edges in the search 
graph, G(V,E), appear in Figure 5.18.a. Note that G is a topological graph, as 
defined in Example 4.1.6, because each vertex is a configuration and each edge is 
a path. If $q_I$ and $q_G$ do not coincide with grid points, they need to be 
connected to some nearby grid points, as shown in Figure 5.18.b. What grid points 
should $q_I$ and $q_G$ be connected to? As a general rule, if k-neighbors are 
used, then one should try connecting $q_I$ and $q_G$ to any grid points that are 
at least as close as the furthest k-neighbor from a typical grid point. 
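
The k-neighborhoods defined in (5.35) and (5.36) can be enumerated by working 
with the integer grid indices $j_1, \ldots, j_n$ rather than configurations. The 
sketch below is only illustrative: the function names are hypothetical, 
collisions are ignored, and boundary points are handled by simply dropping 
indices outside of the grid, which assumes that no identifications are made. 

```python
from itertools import product


def k_neighborhood_offsets(n, k):
    """Index offsets in {-1, 0, 1}^n that change between 1 and k coordinates."""
    return [o for o in product((-1, 0, 1), repeat=n)
            if 1 <= sum(1 for c in o if c != 0) <= k]


def neighbors(index, resolutions, k):
    """k-neighbors of a grid index (j_1, ..., j_n), with 0 <= j_i <= k_i.

    Candidates that leave the grid are discarded (no identifications).
    """
    result = []
    for offset in k_neighborhood_offsets(len(index), k):
        cand = tuple(j + c for j, c in zip(index, offset))
        if all(0 <= j <= k_i for j, k_i in zip(cand, resolutions)):
            result.append(cand)
    return result
```

For k = 1 the number of offsets is 2n, and for k = n it is $3^n - 1$, matching 
the counts given above. 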

Usually, not all of the vertices and edges shown in Figure 5.18.a will appear 
in G because some will intersect $C_{obs}$. Figure 5.18.c shows a more typical 
situation, in which some of the potential vertices and edges are removed because of 
collisions. This representation could be computed in advance by collision checking 
all potential vertices and edges. This would lead to a roadmap, which is suited 
for multiple queries, and is covered in Section 5.6. In this section, it is assumed 
that G is revealed "on the fly" during the search. This is the same situation that 
occurs for the discrete planning methods from Section 2.3. In the current setting, 
the potential edges of G are validated during the search. The candidate edges to 
evaluate are given by the definition of the k-neighborhoods. During the search, 
any edge or vertex that has been checked for collision explicitly appears in a data 
structure so that it does not need to be checked again. At the end of the search, 
a path is found, as depicted in Figure 5.18.d. 

Grid resolution issues: The method explained so far will nicely find the 
solution to many problems, when provided with the correct resolution. If the 
number of points per axis is too high, then the search may be too slow. This 
motivates selecting fewer points per axis, but then solutions might be missed. 
This problem is fundamental to sampling-based motion planning. In a more general 
setting, if other forms of sampling and neighborhoods are used, then enough 
samples have to be generated to yield the right dispersion. 

There are two general ways to avoid having to select this resolution (or more 
generally, dispersion): 

1. Iteratively refine the resolution until a solution is found. In this case, 
sampling and searching become interleaved. One important variable is how 
frequently to alternate between the two processes; this will be presented 
shortly. 

2. An alternative is to abandon the adaptation of classical discrete search 
algorithms, and develop algorithms directly for the continuous problem. This 
forms the basis of the methods in Sections 5.4.3, 5.4.4, and 5.5. 

The most straightforward approach is to iteratively improve the grid resolution. 
Suppose that initially, a standard grid with $2^n$ points total and 2 points per 
axis is searched using one of the discrete search algorithms, such as best-first 
or A*. If the search fails, what should be done? One possibility is to double the 
resolution, which yields a grid with $4^n$ points. Many of the edges can be reused 
from the first grid; however, this savings diminishes rapidly in higher 
dimensions. Once the resolution is doubled, the search can be applied again. If it 
fails again, then the resolution can be doubled again to yield $8^n$ points. In 
general, there would be a full grid with $2^{in}$ points, for each i. The problem 
is that if n is large, then the rate of growth is too large. For example, if 
n = 10, then there would initially be 1024 points; however, when this fails, the 
search is not performed again until there are over one million points! If this 
also fails, then it might take a very long time to reach the next level of 
resolution, which has $2^{30}$ points. 

A method similar to iterative deepening from Section 2.3.2 would be preferable. 
Simply discard the efforts of the previous resolution, and make grids that have i 
points per axis ($i^n$ points in total), for each iteration i. This will yield 
grids of sizes $2^n$, $3^n$, $4^n$, etc., which is much better. The amount of 
effort involved in searching a larger grid is insignificant compared to the time 
wasted on lower resolution grids. Therefore, it seems harmless to discard previous 
work. 
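
The difference between the two refinement schedules is easy to tabulate. The 
short sketch below (with hypothetical function names) lists the total number of 
grid points at each level when the number of points per axis is doubled, versus 
when it is increased by one. 

```python
def doubling_schedule(n, levels):
    """Grid sizes when points per axis go 2, 4, 8, ...: (2^i)^n points."""
    return [(2 ** i) ** n for i in range(1, levels + 1)]


def incremental_schedule(n, levels):
    """Grid sizes when points per axis go 2, 3, 4, ...: i^n points."""
    return [i ** n for i in range(2, levels + 2)]


# For n = 10:
#   doubling:    1024, 1048576, 1073741824
#   incremental: 1024, 59049, 1048576
```

For n = 10, the incremental schedule reaches roughly one million points on its 
third grid, after trying an intermediate grid of 59049 points that the doubling 
schedule skips. 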

A better solution is not to require that a complete grid exists before it can 
be searched. For example, the resolution can be increased for one axis at a time 
before attempting to search again. Even better yet may be to tightly interleave 
searching and sampling. For example, imagine that the samples appear as an 
infinite, dense sequence $\alpha$. The graph can be searched after every 100 
points are added, assuming that neighborhoods can be defined or constructed even 
though the grid is only partially completed. If the search is performed too 
frequently, then searching would dominate the running time. An easy way to make 
this efficient is to apply the union-find algorithm [176, 655] to iteratively keep 
track of connected components in G instead of performing explicit searching. If 
$q_I$ and $q_G$ become part of the same connected component, then a solution path 
has been found. Every time a new point in the sequence $\alpha$ is added, the 
"search" is performed in almost$^8$ constant time by the union-find algorithm. 
This is the tightest interleaving of the sampling and searching, and results in a 
nice sampling-based algorithm that requires no resolution parameter. It is perhaps 
best to select a sequence $\alpha$ that contains some lattice structure to 
facilitate the determination of neighborhoods in each iteration. 
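
The connected-component bookkeeping can be sketched with a standard union-find 
structure (union by rank with path compression). The class and function names 
are hypothetical, vertices are assumed to be hashable (for example, grid index 
tuples), and the edges passed in are assumed to have already been validated by 
the collision checker. 

```python
class UnionFind:
    def __init__(self):
        self.parent = {}
        self.rank = {}

    def add(self, x):
        if x not in self.parent:
            self.parent[x] = x
            self.rank[x] = 0

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:          # path compression
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.rank[ra] < self.rank[rb]:      # union by rank
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1


def first_connecting_edge(edges, q_init, q_goal):
    """Index of the first edge after which q_init and q_goal share a
    component, or None if they never do."""
    uf = UnionFind()
    uf.add(q_init)
    uf.add(q_goal)
    for k, (a, b) in enumerate(edges):
        uf.add(a)
        uf.add(b)
        uf.union(a, b)
        if uf.find(q_init) == uf.find(q_goal):
            return k
    return None
```

Union-find only reports that the two configurations have become connected; 
extracting the actual solution path still requires one graph search within 
that component. 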

What if we simply declare the resolution to be outrageously high at the outset? 
Imagine there are $100^n$ points in the grid. This places all of the burden on the 
search algorithm. If the search algorithm itself is good at avoiding local minima 
and has built-in multiresolution qualities, then it may perform well without the 
iterative refinement of the sampling. The method of Section 5.4.3 is based on 
this idea by performing best-first search on a high-resolution grid, combined with 
random walks to avoid local minima. The search algorithms of Section 5.5 go one 
step further and search in a multiresolution way without requiring resolutions and 
neighborhoods to be explicitly determined. This can be considered as the limiting 
case as the number of points per axis approaches infinity. 

8 It is not constant because the running time includes the inverse Ackermann 
function, which grows very, very slowly. For all practical purposes, the algorithm 
operates in constant time. See Section ??. 

Although this section focused on grids, it is also possible to use other forms of 
sampling from Section 5.2. This requires defining the neighborhoods in a suitable 
way that generalizes the k-neighborhoods of this section. In every case, an infinite, 
dense sample sequence must be defined to obtain dispersion completeness. Meth- 
ods for obtaining neighborhoods for irregular sample sets have been developed in 
the context of sampling-based roadmaps; see Section 5.6. The notion of improv- 
ing resolution becomes generalized to adding samples that improve dispersion (or 
even discrepancy). 

Notions of completeness It is useful to define several notions of completeness 
for sampling-based algorithms. An algorithm is considered complete if for any 
input it correctly reports whether or not there is a solution in a finite amount of 
time. If there is a solution, it must return it. Unfortunately, completeness cannot 
be achieved with sampling-based planning. If α is a deterministic, dense sequence, 
then the refinement scheme described so far produces a dispersion complete 
algorithm. This means that if a solution exists, then the algorithm will find it; 
however, if no solution exists, then the algorithm will run forever. If it terminates 
early without finding a solution, it may declare that either no solution exists, or 
if a solution exists, it requires sampling with a smaller dispersion. The latter case 
implies that the path must travel through a narrow passage. A special case of dispersion 
completeness is when a multiresolution grid or lattice is used. In this case, an 
algorithm may be called resolution complete. Finally, if α is a random sequence 
that is dense with probability one, then the algorithm is probabilistically 
complete. This means that as the number of points increases, the probability that it 
finds an existing solution converges to one. The most relevant information, however, 
is the rate at which the convergence occurs. This is usually very difficult to establish. 

5.4.3 Randomized Potential Fields 

Adapting the classical algorithms, as described in Section 5.4.2, works well if the 
problem can be solved with a small number of points. The number of points per 
axis must be small or the dimension must be low, to ensure that the number of 
points, k^n, for k points per axis and n dimensions, is small enough so that every 
vertex in G can be reached in a reasonable amount of time. If, for example, the 
problem requires 50 points per axis and the dimension is 10, then it is impractical 
to search all of the 50^10 (roughly 10^17) samples. Planners that exploit best-first 
heuristics might find the answer without searching most of them; however, for a simple problem 

Figure 5.19: The randomized potential field method can be modeled as a three- 
state machine. 

such as that shown in Figure 5.14, the planner will take too long exploring the 
nodes in the bowl. 9 

The randomized potential field approach [51, 53, 437] uses random walks to attempt to 
escape local minima when best-first search becomes stuck. It was one of 
the first sampling-based planners that developed specialized techniques beyond 
classical search, in an attempt to better solve challenging motion planning 
problems. In many cases, remarkable results were obtained. In its time, the approach 
was able to solve problems up to 31 degrees of freedom, which was well beyond 
what had been previously possible. The main drawback, however, was that the 
method involved many heuristic parameters that had to be adjusted for each 
problem. This frustration eventually led to the development of better approaches, 
which are covered in Sections 5.4.4, 5.5, and 5.6. Nevertheless, it is worthwhile to 
study the clever heuristics involved in this earlier method because they illustrate 
many interesting issues, and the method was very influential in the development 
of other sampling-based planning algorithms. 10 

The most complicated part of the algorithm is the definition of a potential 
function, which can be considered as a pseudometric that tries to estimate the 
distance of any configuration from the goal. In most formulations, there is an 
attractive term, which is just a metric on C that yields the distance to the goal, and 
a repulsive term, which penalizes the robot as it gets too close to obstacles. The 
construction of potential functions involves many heuristics and is covered in great 
detail in [437]. One of the most effective methods involves constructing cost-to-go 
functions over W and lifting them to C [52]. In this section, it will be sufficient to 
assume that some potential function, g(q), is defined, which is the same notation 
(and notion) as a cost-to-go function in Section 2.3.2. In this case, however, there 
is no requirement that g(q) is optimal or even an underestimate of the true cost 
to go. 

When a random walk is needed, it is executed for some number of iterations. 

9 Of course, that problem does not appear to need so many points per axis; fewer may be 
used instead, if the algorithm can adapt the sampling resolution or dispersion. 

10 The exciting results obtained by the method also helped inspire me to work in motion 
planning. 

Using the discretization procedures of Section 5.4.2, a high-resolution grid (e.g., 50 
points per axis) is initially defined. In each iteration, the current configuration is 
modified as follows. Each coordinate, q_i, is increased or decreased by Δq_i (the grid 
step size) based on the outcome of a fair coin toss. Topological identifications must 
be respected, of course. After each iteration, the new configuration is checked for 
collision, or whether it exceeds the boundary of C (if it has a boundary). If so, 
then it is discarded and another attempt is made from the previous configuration. 
The failures can repeat indefinitely until a configuration in C_free is obtained. 
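
A single iteration of this random walk might be sketched as follows. The function name random_walk_step, the callback is_free (assumed to return False for configurations that are in collision or outside the boundary of C), and the cap max_tries are assumptions introduced for the illustration; the text allows failures to repeat indefinitely, and topological identifications are not handled here.

```python
import random


def random_walk_step(q, step, is_free, rng=random, max_tries=1000):
    """Perturb every coordinate of q by plus or minus its grid step size.

    q and step are sequences of equal length. A rejected candidate is
    discarded and a new attempt is made from the same configuration q.
    """
    for _ in range(max_tries):
        candidate = tuple(qi + rng.choice((-1, 1)) * di for qi, di in zip(q, step))
        if is_free(candidate):
            return candidate
    return q  # hypothetical safeguard: give up and stay in place
```

Because every coordinate moves in each iteration, the walk steps along grid diagonals rather than along a single axis.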

The resulting planner can be described in terms of a three-state machine, which 
is shown in Figure 5.19. Each state will be called a mode to avoid confusion with 
earlier state space concepts. The VSM and LPM are defined in terms of the mode. 
Initially, the planner is in the BEST FIRST mode, and uses q_I to start a gradient 
descent. While in the BEST FIRST mode, the VSM selects the newest vertex, 
v ∈ V. In the first iteration, this is q_I. The LPM creates a new vertex, v_n, in a 
neighborhood of v, in a direction that minimizes g. The direction sampling may 
be performed using randomly selected or deterministic samples. Using random 
samples, the sphere sampling method from Section 5.2.2 may be applied. After some 
number of tries (another parameter), if the LPM is unsuccessful at reducing g, then the 
mode is changed to RANDOM WALK because the best-first search is stuck in a local 
minimum. 

In the RANDOM WALK mode, a random walk is executed from the newest vertex. 
The random walk terminates if either g is lowered or a specified limit of iterations 
is reached. The limit is actually sampled from a predetermined random variable 
(which contains parameters that also must be selected). When the RANDOM 
WALK mode terminates, the mode is changed back to BEST FIRST. A counter 
is incremented to keep track of the number of times that the random walk was 
attempted. If BEST FIRST fails after K random walks have been attempted, then 
the BACKTRACK mode is entered. The K is another parameter (a typical value 
is K = 20 [52]). The BACKTRACK mode selects a vertex at random from among 
the vertices in V that were obtained during a random walk. Following this, the 
counter is reset, and the mode is changed back to BEST FIRST. 

Due to the random walks, the resulting paths are often too complicated to be 
useful in applications. Fortunately, it is straightforward to transform a computed 
path into a simpler one that is still collision free. A common approach is to 
iteratively pick pairs of points at random along the domain of the path, and 
attempt to replace the path segment with a straight-line path (or geodesic). For 
example, suppose t_1, t_2 ∈ [0, 1] are chosen at random with t_1 < t_2, and τ : [0, 1] → C_free is the 
solution path. This path is transformed into a new path 

\[
\tau'(t) = \begin{cases}
\tau(t) & \text{if } 0 \le t \le t_1 \\
a\,\tau(t_1) + (1 - a)\,\tau(t_2) & \text{if } t_1 \le t \le t_2 \\
\tau(t) & \text{if } t_2 \le t \le 1,
\end{cases}
\tag{5.37}
\]

in which a ∈ [0, 1] is the weight placed on τ(t_1); it decreases from 1 at t = t_1 to 0 at 
t = t_2. Explicitly, a = (t_2 − t)/(t_2 − t_1). The new path must be checked for collision. If it passes, 
then it replaces the old path; otherwise, it is discarded and a new pair t_1, t_2 is 
chosen. 
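
The path transformation above can be sketched in a few lines. This is an illustration under assumptions rather than a reference implementation: a path is modeled as a Python function from [0, 1] to a tuple of coordinates, straight-line interpolation stands in for the geodesic, and the names splice, simplify, and path_free (a hypothetical collision checker for the replaced segment) are introduced only for this example.

```python
import random


def splice(tau, t1, t2):
    """Return the path of (5.37): tau with tau([t1, t2]) replaced by a straight segment."""
    p, q = tau(t1), tau(t2)

    def tau_new(t):
        if t <= t1 or t >= t2:
            return tau(t)
        a = (t2 - t) / (t2 - t1)               # weight on tau(t1): 1 at t1, 0 at t2
        return tuple(a * pi + (1 - a) * qi for pi, qi in zip(p, q))

    return tau_new


def simplify(tau, path_free, attempts=100, rng=random):
    """Repeatedly try random shortcuts, keeping those that pass the collision check."""
    for _ in range(attempts):
        t1, t2 = sorted((rng.random(), rng.random()))
        if t1 == t2:
            continue
        candidate = splice(tau, t1, t2)
        if path_free(candidate, t1, t2):       # assumed to check only the new segment
            tau = candidate
    return tau
```

Each accepted shortcut wraps the previous path in another function, which is fine for a modest number of attempts; a practical implementation would more likely store the path as a list of waypoints.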

The randomized potential field approach can escape high-dimensional local 
minima, which allowed interesting solutions to be found for many challenging 
high-dimensional problems. Unfortunately, the heavy amount of parameter tuning 
caused most people to abandon the method in recent times, in favor of newer 
methods. 

5.4.4 Other Methods 

Several influential sampling-based methods are given here. Each of them appears 
to offer advantages over the randomized potential field method. All of them use 
randomization, which was perhaps inspired by the potential field method. 

Ariadne's Clew algorithm This approach grows a search tree that is biased 
to explore as much new territory as possible in each iteration [544, 543]. There are 
two modes, search and explore, which alternate over successive iterations. In 
the explore mode, the VSM simply selects a vertex, v_e, at random, and the LPM 
finds a new configuration that can be easily connected to v_e and is as far as possible 
from the other vertices in G. A global optimization function that aggregates the 
distances to other vertices is optimized using a genetic algorithm. In the search 
mode, an attempt is made to extend the vertex added in the explore mode to the 
goal configuration. The key idea from this approach, which influenced both the next 
approach and the methods in Section 5.5, is that some of the time must be spent 
exploring the space, as opposed to focusing on finding the solution. The greedy 
behavior of the randomized potential field led to some efficiency, but was also its 
downfall for some problems because it was all based on escaping local minima 
with respect to the goal instead of investing some time on pure exploration. One 
disadvantage of Ariadne's Clew algorithm is that it is very difficult to solve the 
optimization problem for placing a new vertex in the explore mode. Genetic 
algorithms were used in [543], which are generally avoided for motion planning 
because of the required problem-specific parameter tuning. 

Expansive space planner This method [344, 670] generates samples in a way 
that attempts to explore new parts of the space. In this sense, it is similar to the 
explore mode of the Ariadne's Clew algorithm. Furthermore, the planner is made 
more efficient by borrowing the bidirectional search idea from discrete algorithms, 
as covered in Section 2.3.3. The VSM selects a vertex, v_e, in G with a probability 
that is inversely proportional to the number of other vertices of G that lie within a 
predetermined neighborhood of v_e. Thus, "isolated" vertices are more likely to be 
chosen. The LPM generates a new vertex v_n at random within a predetermined 
neighborhood of v_e. It will decide to insert v_n into G with a probability that 
is inversely proportional to the number of other vertices of G that lie within a 
predetermined neighborhood of v_n. For a fixed number of iterations, the VSM 
will repeatedly choose the same vertex, until moving on to another vertex. The 
resulting planner is able to solve many interesting problems by using a surprisingly 
simple criterion for the placement of points. The main drawbacks are that the 
planner requires substantial parameter tuning, which is problem specific (or at least 
specific to a similar family of problems), and the performance tends to degrade 
if the query requires systematically searching a long labyrinth. Choosing the 
radius of the predetermined neighborhoods essentially amounts to determining the 
appropriate resolution. 
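
The two density-based choices made by this planner can be illustrated with a short sketch. It is not taken from [344, 670]: the function names are hypothetical, the neighborhood is assumed to be a Euclidean ball of a given radius, and the weight 1/(1 + count) is an assumption made here so that a vertex with no neighbors does not cause a division by zero.

```python
import math
import random


def count_neighbors(v, vertices, radius):
    """Number of other vertices within the given radius of v."""
    return sum(1 for u in vertices if u is not v and math.dist(u, v) <= radius)


def select_vertex(vertices, radius, rng=random):
    """VSM: choose v_e with weight inversely related to its neighbor count."""
    weights = [1.0 / (1 + count_neighbors(v, vertices, radius)) for v in vertices]
    return rng.choices(vertices, weights=weights, k=1)[0]


def accept_vertex(v_new, vertices, radius, rng=random):
    """LPM filter: keep v_n with probability inversely related to its neighbor count."""
    return rng.random() < 1.0 / (1 + count_neighbors(v_new, vertices, radius))
```

With this weighting, a vertex in an unexplored region is both more likely to be selected for expansion and more likely to be accepted once generated. The radius plays the role of the resolution parameter mentioned above.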

Random walk planner A surprisingly simple and efficient algorithm can be 
made entirely from random walks [127]. To avoid parameter tuning, the algorithm 
adjusts its distribution of directions and magnitude in each iteration, based on 
the success of the past k iterations (perhaps k is the only parameter). In each 
iteration, the VSM just selects the vertex that was most recently added to G. 
The LPM generates a direction and magnitude by generating samples from a 
multivariate Gaussian distribution whose covariance parameters are adaptively 
tuned. The main drawback of the method is similar to that of the previous 
method. Both have difficulty traveling through long, winding corridors. It would 
be interesting to combine adaptive random walks with other search algorithms, 
such as the potential field planner, but this has not been attempted to date. 



5.5 Rapidly-Exploring Dense Trees 

This section introduces an incremental sampling and search approach that yields 
good performance in practice without any parameter tuning. 11 The idea is to 
incrementally construct a search tree that gradually improves the resolution, but 
does not need to explicitly set any resolution parameters. In the limit, the tree 
will densely cover the space. Thus, it has properties similar to space filling curves 
[668], but instead of one long path, there are shorter paths that are organized 
into a tree. A dense sequence of samples is used as a guide in the incremental 
construction of the tree. If this sequence is random, the resulting tree will be called 
a Rapidly-exploring Random Tree (RRT). In general, this family of trees, whether 
the sequence is random or deterministic, will be referred to as Rapidly-exploring 
Dense Trees (RDTs) to indicate that a dense covering of the space is obtained. 
This method was originally developed for problems with differential constraints 
[463, 466]; that case is covered in Section 15.3.3. 

11 The original RRT [449] was introduced with a step size parameter, but this is eliminated in 
the current presentation. For implementation purposes, one might still want to revert to this 
older way of formulating the algorithm because the implementation is a little easier. This will 
be discussed shortly. 

SIMPLE_RDT(q_0) 
1 G.init(q_0); 
2 for i = 1 to k do 
3     G.add_vertex(α(i)); 
4     q_n ← NEAREST(S, α(i)); 
5     G.add_edge(q_n, α(i)); 



Figure 5.20: The basic algorithm for constructing RDTs (including RRTs) when 
there are no obstacles. It requires the availability of a dense sequence, α, and 
iteratively connects from α(i) to the closest point among all those reached by G. 




Figure 5.21: Suppose inductively that the following tree has been constructed so 
far using the algorithm in Figure 5.20. 

5.5.1 The Exploration Algorithm 

Before explaining how to use these trees to solve a planning query, imagine that 
the goal is to get as close as possible to every configuration, starting from an 
initial configuration. The method will work for any dense sequence. Therefore, 
let α denote an infinite, dense sequence of samples in C. The i-th sample is 
denoted by α(i). This may also be a uniform, random sequence, which is dense 
with probability one. Random sequences that induce a nonuniform bias are also 
acceptable, as long as they are dense with probability one. 

An RDT will actually be a topological graph, G(V, E). Let S ⊆ C_free indicate 
the set of all points reached by G. Since each e ∈ E is a path, this can be expressed 
as 

\[
S = \bigcup_{e \in E} e([0, 1]), \tag{5.38}
\]

in which e([0, 1]) ⊆ C_free is the image of the path e. 

The exploration algorithm is first explained in Figure 5.20 without any 
obstacles or boundary obstructions. It is assumed that C is a metric space. Initially, a 
vertex is made at q_0. For k iterations, a tree is iteratively grown by connecting 
α(i) to its closest point on S. The connection is usually made along the shortest 
possible path. In every iteration, α(i) becomes a vertex. Therefore, the resulting 
tree is dense. Figures 5.21-5.23 illustrate an iteration graphically. Suppose the 
tree has 3 edges and 4 vertices, as shown in Figure 5.21. If the nearest point, 
q_n ∈ S, to α(i) is a vertex, as shown in Figure 5.22, then an edge is made from 

Figure 5.23: If the nearest point in S lies in an edge, then the edge is split into two, 
and a new vertex is inserted into G. 




45 iterations 390 iterations 

Figure 5.24: The RRT quickly reaches the unexplored parts. 






S. M. LaValle: Planning Algorithms 




2345 iterations 

Figure 5.25: The RRT is dense in the limit (with probability one). 

Figure 5.26: If there is an obstacle, the edge travels up to the obstacle boundary, 
as far as allowed by the collision detection algorithm. 

q_n to α(i). However, if the closest point lies in the interior of an edge, as shown 
in Figure 5.23, then the existing edge is split so that q_n appears as a new vertex, 
and an edge is made from q_n to α(i). 

The method as described here does not fit precisely under the general 
framework from Section 5.4.1; however, with modifications suggested in Section 5.5.2, 
it can be adapted to fit. In the present formulation, the nearest function serves 
the purpose of the VSM, but in this case, a point may be selected from anywhere 
in the interior of an edge, in addition to a vertex. The LPM tries to connect α(i) 
to q_n along the shortest path possible in C. 

Figures 5.24 and 5.25 show an implementation of the algorithm in Figure 5.20 
for the case in which C = [0, 1]^2 and q_0 = (1/2, 1/2). It exhibits a kind of fractal 
behavior. 12 Several main branches are first constructed as it rapidly reaches the 
far corners of the space. Gradually, more and more area is filled in by smaller 
branches. From the pictures, it is clear that in the limit, the tree will densely fill 
the space. Thus, it can be seen that the tree gradually improves the resolution 
(or dispersion) as the iterations continue. This behavior turns out to be ideal for 
sampling-based motion planning. 

12 If α is uniform and random, then a stochastic fractal [435] is obtained. Deterministic fractals 
can be constructed using sequences that have appropriate symmetries. 

Recall that in sampling-based motion planning, the obstacle region C_obs is not 
explicitly represented. Therefore, it must be taken into account in the 
construction of the tree. Figure 5.26 indicates how to modify the algorithm in Figure 5.20 
so that collision checking is taken into account. The pseudocode for the modified 
algorithm appears in Figure 5.27. The procedure stopping-configuration 
yields the closest configuration possible to the boundary of C_free, along the 
direction toward α(i). The closest point q_n ∈ S is defined in the same way as before 
(obstacles are ignored); however, the new edge might not reach α(i). In this case, an edge is 
made from q_n to q_s, the last point possible before hitting the obstacle. How close 
can the edge come to the obstacle boundary? This depends on the method used 
to check for collision, as explained in Section 5.3.4. It is sometimes possible that 
q_n is already as close as possible to the boundary of C_free in the direction of α(i). 
In this case, no new edge or vertex is added for that iteration. 

RDT(q_0) 
1 G.init(q_0); 
2 for i = 1 to k do 
3     q_n ← NEAREST(S, α(i)); 
4     q_s ← STOPPING-CONFIGURATION(q_n, α(i)); 
5     if q_s ≠ q_n then 
6         G.add_vertex(q_s); 
7         G.add_edge(q_n, q_s); 

Figure 5.27: The RDT with obstacles. 
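
The algorithm with obstacles can be sketched in Python under several simplifying assumptions, all of which are choices made for this illustration: C is a subset of R^n with the Euclidean metric, is_free is a hypothetical point collision checker, the stopping configuration is found by stepping along the straight segment at a fixed resolution step, and the nearest point is taken over the vertices only rather than over all of S.

```python
import math


def stopping_configuration(q_n, alpha, is_free, step=0.01):
    """Walk from q_n toward alpha; return the last configuration found to be free."""
    dist = math.dist(q_n, alpha)
    if dist == 0.0:
        return q_n
    n = max(1, math.ceil(dist / step))
    q_s = q_n
    for j in range(1, n + 1):
        t = j / n
        # Use alpha itself at the last step to avoid floating-point drift.
        q = alpha if j == n else tuple(a + t * (b - a) for a, b in zip(q_n, alpha))
        if not is_free(q):
            break
        q_s = q
    return q_s


def rdt(q0, samples, is_free, step=0.01):
    """Grow a tree from q0; returns (vertices, edges) with edges as (parent, child)."""
    vertices, edges = [q0], []
    for alpha in samples:
        q_n = min(vertices, key=lambda v: math.dist(v, alpha))  # vertex-only nearest
        q_s = stopping_configuration(q_n, alpha, is_free, step)
        if q_s != q_n:
            vertices.append(q_s)
            edges.append((q_n, q_s))
    return vertices, edges
```

How close q_s comes to the obstacle boundary is governed here by step, which is the kind of dependence on the collision checking method that the text refers to.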

5.5.2 Efficiently Finding Nearest Points 

There are several interesting alternatives for implementing the nearest function 
in Line 4 of the algorithm in Figure 5.20 (Line 3 in Figure 5.27). There are generally 
two families of methods: exact or approximate. First consider the exact case. 

Exact solutions Suppose that all edges in G are line segments in R^m for some 
dimension m ≥ n. An edge that is generated early in the construction process will 
be split many times in later iterations. For the purposes of finding the nearest 
point in S, however, it is best to handle this as a single segment. For example, 
see the three large branches that extend from the root in Figure 5.24. As the 
number of points increases, the benefit of agglomerating the segments increases. 
Let each of these agglomerated segments be referred to as a supersegment. To 
implement nearest, a primitive is needed that computes the distance between 
a point and a line segment. This can be performed in constant time with simple 
vector computations. Using this primitive, nearest is implemented by iterating 
over all of the supersegments and taking the point with minimum distance among 
all of them. It may be possible to improve performance by building hierarchical 
data structures that can eliminate large sets of supersegments, but this remains 
to be seen experimentally. 
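
The point-to-segment primitive and the resulting exhaustive nearest computation can be sketched as follows, assuming Euclidean line segments in R^m. The function names are hypothetical, and each supersegment is represented simply as a pair of endpoint tuples.

```python
import math


def closest_point_on_segment(p, a, b):
    """Closest point to p on the segment from a to b (constant time per segment)."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    denom = sum(c * c for c in ab)
    if denom == 0.0:                           # degenerate segment: a == b
        return tuple(a)
    t = sum((pi - ai) * c for pi, ai, c in zip(p, a, ab)) / denom
    t = max(0.0, min(1.0, t))                  # clamp the projection to the segment
    return tuple(ai + t * c for ai, c in zip(a, ab))


def nearest_on_segments(p, segments):
    """Exact nearest point to p over a list of (a, b) supersegments."""
    candidates = (closest_point_on_segment(p, a, b) for a, b in segments)
    return min(candidates, key=lambda q: math.dist(p, q))
```

The cost of nearest_on_segments is linear in the number of supersegments, which is why agglomerating split edges into supersegments reduces the work.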

In some cases, the edges of G may not be line segments. For example, the 
shortest paths between two points in SO(3) are actually circular arcs along S^3. 

Figure 5.28: For implementation ease, intermediate vertices can be inserted to 
avoid checking for close points along line segments. The tradeoff is that the 
number of vertices is increased. 

One possible solution is to maintain a separate parameterization of C for the 
purposes of computing the nearest function. For example, SO(3) can be 
represented as [0, 1]^3/∼, by making the appropriate identifications to obtain RP^3. 
Then straight line segments can be used. The problem is that the resulting 
metric is not consistent with the Haar measure, which means that an accidental bias 
would result. Another option is to tightly enclose S^3 in a 4D cube. Every point on 
S^3 can be mapped outward onto a cube face. Because of antipodal identification, 
only 4 of the 8 cube faces need to be used to obtain a bijection between the set 
of all rotations and the cube surface. Linear interpolation can be used along the 
cube faces, as long as both points remain on the same face. If the points are on 
different faces, then two line segments can be used by bending the shortest path 
around the corner between the two faces. This scheme will result in less distortion 
than mapping SO(3) to [0, 1]^3/∼; however, some distortion will still exist. 

Another approach is to avoid distortion altogether and implement primitives 
that can compute the distance between a point and a curve. In the case of SO(3), 
a primitive is needed that can find the distance between a circular arc in R^m 
and a point in R^m. This might not be too difficult, but if the curves are more 
complicated, then an exact implementation of the nearest function may be too 
expensive computationally. 

Approximate solutions Approximate solutions are much easier to construct; 
however, a resolution parameter is introduced. Each path segment can be 
approximated by inserting intermediate vertices along long segments, as shown in Figure 
5.28. The intermediate vertices should be added each time a new sample, α(i), is 
inserted into G. A parameter Δq can be defined, and intermediate samples are 
inserted to ensure that no two consecutive vertices in G are ever further than Δq 
from each other. Using intermediate vertices, the interiors of the edges in G are 
ignored when finding the nearest point in S. The approximate computation of 
nearest is performed by finding the closest vertex to α(i) in G. This approach 
is by far the simplest to implement (in fact, it was done to obtain the results in 







Figure 5.29: The Kd-tree can be used for efficient nearest neighbor computations. 



Figure 5.24). It also fits precisely under the incremental sampling and searching 
framework from Section 5.4.1. 
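
The approximate scheme might look as follows in code. The sketch assumes straight-line edges in R^n and the Euclidean metric; the names insert_with_intermediates and nearest_vertex, and the list-based graph representation, are hypothetical choices for the illustration. The argument dq corresponds to Δq.

```python
import math


def insert_with_intermediates(vertices, edges, q_from, q_to, dq):
    """Connect q_from to q_to, adding vertices spaced at most dq apart."""
    n = max(1, math.ceil(math.dist(q_from, q_to) / dq))
    prev = q_from
    for j in range(1, n + 1):
        t = j / n
        q = q_to if j == n else tuple(a + t * (b - a) for a, b in zip(q_from, q_to))
        vertices.append(q)
        edges.append((prev, q))
        prev = q


def nearest_vertex(vertices, alpha):
    """Approximate nearest: closest vertex only, edge interiors are ignored."""
    return min(vertices, key=lambda v: math.dist(v, alpha))
```

Since no edge is longer than dq, the vertex returned by nearest_vertex is never much farther from α(i) than the true nearest point of S, at the price of storing more vertices.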

When using intermediate vertices, the tradeoffs are clear. The computation 
time for each evaluation of nearest is linear in the number of vertices. Increas- 
ing the number of vertices improves the quality of the approximation, but also 
dramatically increases running time. One way to recover some of this cost is 
to insert the vertices into an efficient data structure for nearest-neighbor 
searching. One of the most practical and widely used data structures is the Kd-tree 
[189, 263, 599]. A depiction is shown in Figure 5.29 for 14 points in R^2. The 
Kd-tree can be considered as a multidimensional generalization of a binary search 
tree. The Kd-tree is constructed for points, P, in R^2 as follows. Initially, sort 
the points with respect to the X coordinate. Take the median point, p ∈ P, and 
divide P into two sets depending on which side of a vertical line through p the 
other points fall. For each of the two sides, sort the points by the Y coordinate, 
and find the medians. Points are divided at this level based on whether they are 
above or below horizontal lines. At the next level of recursion, vertical lines are 
used again, followed by horizontal again, and so forth. The same idea can be 
applied in R^n by cycling through the n coordinates, instead of alternating between 
X and Y, to form the divisions. In [32], the Kd-tree is extended to topological 
spaces that arise in motion planning, and is shown to yield good performance for 
RRTs and sampling-based roadmaps. The Kd-tree can be constructed in O(nk lg k) 
time, in which k is the number of points. The topology must be carefully considered 
when traversing the tree. When 
a query is made, a point, q ∈ T, is given, and the closest point to q is found. At 
first the query algorithm descends to a leaf node which contains the query point, 
finds all distances from the data points in this leaf to the query point, and picks 
the closest one. Then, it recursively visits those surrounding leaf nodes which 
are closer to the query point than the closest point found so far [32]. The 
nearest point can be found in time logarithmic in k. 
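
A compact Kd-tree for points in R^n can be sketched as below. It is a simplified variant, not the structure of [32]: points are stored at internal nodes rather than only in leaves, the metric is Euclidean with no topological identifications, and the tree is built by re-sorting at every level, which is simple but slower than the best construction bound. The dictionary layout and function names are assumptions of the sketch.

```python
import math


def build_kdtree(points, depth=0):
    """Recursively split on the median, cycling through the coordinates."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return {
        "point": pts[mid],
        "axis": axis,
        "left": build_kdtree(pts[:mid], depth + 1),
        "right": build_kdtree(pts[mid + 1:], depth + 1),
    }


def nearest(node, q, best=None):
    """Return the stored point closest to q."""
    if node is None:
        return best
    if best is None or math.dist(q, node["point"]) < math.dist(q, best):
        best = node["point"]
    diff = q[node["axis"]] - node["point"][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, q, best)              # descend toward the query first
    if abs(diff) < math.dist(q, best):         # far side may still hold a closer point
        best = nearest(far, q, best)
    return best
```

The pruning test in the last step is what makes the query fast: a subtree is visited only if its splitting plane is closer to the query than the best point found so far.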

Unfortunately, these bounds hide a constant that increases exponentially with 
n. In practice, the Kd-tree is useful in motion planning for problems of up to 
about 20 dimensions. After this, the performance usually degrades too much. As 
an empirical rule, if there are more than 2^n points, then the Kd-tree should 
be more efficient than naive nearest neighbors. In general, the tradeoffs must 
be carefully considered in a particular application to determine whether exact 
solutions, approximate solutions with naive nearest neighbor computations, or 
approximate solutions with Kd-trees will be more efficient. There is also the issue 
of implementation complexity, which probably has caused most people to prefer 
the approximate solution with naive nearest neighbor computations. 

5.5.3 Using the Trees for Planning 

So far, the discussion has focused on exploring C_free, but this does not solve a 
planning query by itself. There are many ways that RRTs and RDTs in general 
can be used in planning algorithms. For example, they could be used to escape 
local minima in the randomized potential field planner of Section 5.4.3. 

Single-tree search A reasonably efficient planner can be made by directly using 
the algorithm in Figure 5.27, if the sequence α contains the appropriate bias. 
If the sample sequence is random, which generates an RRT, then the following 
modification will work well. In each iteration, toss a biased coin that has probability 
49/50 of being heads, and 1/50 of being tails. If the result is heads, then 
set α(i) to be the next element of the pseudorandom sequence. Otherwise, set 
α(i) = q_G. This will force the RDT to occasionally attempt making a connection 
to the goal, q_G. Of course, 1/50 is arbitrary, but it is in a range that works well 
experimentally. If the bias is too strong, then the RDT will become too greedy 
like the randomized potential field. If the bias is not strong enough, then there 
will be no incentive to connect the tree to q_G. 
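
The biased coin toss amounts to a few lines of code. In this sketch the names goal_biased_sample and next_sample are hypothetical; next_sample is assumed to be a function that returns the next element of the pseudorandom sequence, and p_goal defaults to the 1/50 used in the text.

```python
import random


def goal_biased_sample(q_goal, next_sample, p_goal=1 / 50, rng=random):
    """Return q_goal with probability p_goal, otherwise the next random sample."""
    if rng.random() < p_goal:
        return q_goal
    return next_sample()
```

For example, with C = [0, 1]^2 one could pass next_sample=lambda: (random.random(), random.random()) and feed the results to the tree construction as α(i).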

If α is a deterministic sequence, then q_G can be selected with a fixed 
frequency. For example, the Halton sequence can be used, but for every positive 
integer i, q_G is inserted into the Halton sequence between points 50i and 50i + 1. 
Thus, after every 50 Halton points, the RDT will attempt to connect to the goal. Of 
course, the fixed frequency could also be combined with the random sampling. 
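
The deterministic variant can be sketched with a generator. The sketch assumes C = [0, 1]^n and uses the standard radical-inverse construction of the Halton sequence with one prime base per coordinate; the names halton and halton_with_goal, and the parameter period, are introduced only for this illustration.

```python
def halton(index, base):
    """Radical inverse of index (index >= 1) in the given base."""
    result, f = 0.0, 1.0 / base
    while index > 0:
        result += f * (index % base)
        index //= base
        f /= base
    return result


def halton_with_goal(q_goal, period=50, bases=(2, 3)):
    """Yield Halton points, inserting q_goal after every `period` of them."""
    i = 0
    while True:
        i += 1
        yield tuple(halton(i, b) for b in bases)
        if i % period == 0:                    # between points 50i and 50i + 1
            yield q_goal
```

The generator is infinite, so a consumer would typically take a fixed number of elements or stop once a solution is found.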

Other variations can be made by using a dense, but nonuniform sequence in 
C. For example, in the case of random sampling, the probability density function 

RDT_BALANCED_BIDIRECTIONAL(q_I, q_G) 
 1 T_a.init(q_I); T_b.init(q_G); 
 2 for i = 1 to K do 
 3     q_n ← NEAREST(S_a, α(i)); 
 4     q_s ← STOPPING-CONFIGURATION(q_n, α(i)); 
 5     if q_s ≠ q_n then 
 6         T_a.add_vertex(q_s); 
 7         T_a.add_edge(q_n, q_s); 
 8         q'_n ← NEAREST(S_b, q_s); 
 9         q'_s ← STOPPING-CONFIGURATION(q'_n, q_s); 
10         if q'_s ≠ q'_n then 
11             T_b.add_vertex(q'_s); 
12             T_b.add_edge(q'_n, q'_s); 
13         if q'_s = q_s then return SOLUTION; 
14     if |T_b| > |T_a| then SWAP(T_a, T_b); 
15 return FAILURE 

Figure 5.30: A bidirectional RDT-based planner. 

could contain a gentle bias towards the goal. Choosing such a bias is a difficult 
heuristic problem; therefore, such a technique should be used with caution (or 
avoided altogether). 

Balanced, bidirectional search 13 

Much better performance can usually be obtained by growing two RDTs, one 
from q_I and the other from q_G. This is particularly valuable for escaping one of the 
bug traps, as mentioned in Section 5.4.1. For a grid search, it is straightforward 
to implement a bidirectional search that ensures that the two trees meet. For the 
RDT, special considerations must be made to ensure that the two trees will 
connect while retaining their rapidly-exploring property. One additional idea is 
to make sure that the bidirectional search is balanced [], which will ensure that 
both trees are approximately the same size. 

Figure 5.30 gives an outline of the algorithm. The graph, G, is decomposed 
into two trees, denoted by T_a and T_b. Initially, these trees start from q_I and q_G, 
respectively. After some iterations, T_a and T_b will be swapped; therefore, keep in 
mind that T_a is not always the tree that contains q_I. In each iteration, T_a is grown 
exactly the same way as in one iteration of the algorithm in Figure 5.27. If a new 
vertex, q_s, is added to T_a, then an attempt is made in Lines 8-12 to extend T_b. 
Rather than using α(i) to extend T_b, the new vertex, q_s, of T_a is used. This will 
cause T_b to try to grow towards T_a. If the two connect, which is tested in Line 13, 
then a solution has been found. 



This particular planner is due to an unpublished collaborative effort with James Kuffncr. 
Line 14 represents an important step that balances the search. This is particularly 
important for a problem such as the bug trap shown in Figure 5.15 or the 
puzzle shown in Figure 1.2. If one of the trees is having trouble exploring, then 
it makes sense to focus more energy on it. Therefore, new exploration is always 
performed for the smaller tree. How is "smaller" defined? A simple criterion is to 
use the total number of vertices. Another reasonable criterion is to use the total 
length of all segments in the tree. 
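
The balanced bidirectional idea can be sketched in a few lines of Python. This is only an illustration under several assumptions that are not part of the text: configurations are tuples in a Euclidean space, the nearest point is searched among tree vertices only (not along edges), the stopping configuration is found by stepping along a straight segment, and "smaller" means fewer vertices. The names nearest, stopping_configuration, branch, and balanced_bidirectional are hypothetical.

```python
import math
import random

def nearest(tree, q):
    """Closest existing vertex of the tree to q (edge interiors are ignored here)."""
    return min(tree, key=lambda v: math.dist(v, q))

def stopping_configuration(q_from, q_to, is_free, step=0.02):
    """Walk from q_from toward q_to and return the last collision-free point."""
    n = max(1, math.ceil(math.dist(q_from, q_to) / step))
    q_last = q_from
    for k in range(1, n):
        q = tuple(a + (k / n) * (b - a) for a, b in zip(q_from, q_to))
        if not is_free(q):
            return q_last
        q_last = q
    return q_to if is_free(q_to) else q_last

def branch(tree, q):
    """Vertices from q back to the root, following parent pointers."""
    path = []
    while q is not None:
        path.append(q)
        q = tree[q]
    return path

def balanced_bidirectional(q_init, q_goal, is_free, sample, max_iters=2000):
    tree_a, tree_b = {q_init: None}, {q_goal: None}    # vertex -> parent
    for _ in range(max_iters):
        alpha_i = sample()
        q_n = nearest(tree_a, alpha_i)
        q_s = stopping_configuration(q_n, alpha_i, is_free)
        if q_s not in tree_a:                           # T_a actually grew
            tree_a[q_s] = q_n
            q_n2 = nearest(tree_b, q_s)                 # T_b grows toward q_s
            q_s2 = stopping_configuration(q_n2, q_s, is_free)
            if q_s2 not in tree_b:
                tree_b[q_s2] = q_n2
            if q_s2 == q_s:                             # the two trees met at q_s
                a, b = branch(tree_a, q_s), branch(tree_b, q_s)
                if a[-1] != q_init:                     # trees may have been swapped
                    a, b = b, a
                return a[::-1] + b[1:]
        if len(tree_b) < len(tree_a):                   # balance: grow the smaller tree next
            tree_a, tree_b = tree_b, tree_a
    return None
```

Replacing the final test with an unconditional swap would give the unbalanced variant, in which the trees simply alternate.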

An unbalanced bidirectional search can instead be made by forcing the trees 
to be swapped in every iteration. Once the trees are swapped, the roles are 
reversed. For example, after the first swap, T_b is extended in the same way as an 
iteration in Figure 5.20, and if a new vertex, q_s, is added, then an attempt is 
made to connect T_a to q_s. 

One important concern exists when α is deterministic. It might be possible 
that even though α is dense, when the samples are divided among the trees, each 
may not receive a dense set. If each tree uses its own deterministic sequence, then this 
problem can be avoided. In the case of making a bidirectional RRT planner, the 
same (pseudo)random sequence can be used without such troubles. 

More than two trees If a dual-tree approach offers advantages over a single 
tree, then it is natural to ask whether growing three or more RDTs might be 
even better. This is particularly helpful for problems like the double bug trap in 
Figure 5.16. New trees can be grown from parts of C that are difficult to reach. 
Controlling the number of trees and determining when to attempt connections 
between them is difficult. Some promising experimental work has been done in 
this direction, but it currently requires substantial parameter tuning [62]. 

These additional trees could be started at arbitrary (possibly random) configurations. 
As more trees are considered, a complicated decision problem arises. The 
computation time must be divided between attempting to explore the space and 
attempting to connect trees to each other. It is also not clear which connections 
should be attempted. Many research issues remain in the development of this and 
other RRT-based planners. A limiting case would be to start a new tree from 
every sample in α(i), and to try to connect nearby trees whenever possible. This 
approach results in a graph that covers the space in a nice way that is independent 
of the query. This leads to the main topic of the next section. 

5.6 Roadmap Methods for Multiple Queries 

Previously, it was assumed that a single initial-goal pair was given to the planning 
algorithm. Suppose now that numerous initial-goal queries will be given to the 
algorithm, while keeping the robot model and obstacles fixed. This leads to a 
multiple-query version of the motion planning problem. In this case, it makes 
sense to invest substantial time to preprocess the models so that future queries 
can be answered efficiently. The goal is to construct a roadmap that can be used 
BUILD_ROADMAP 

1 G.init(); 
2 for i = 1 to N 
3     if α(i) ∈ C_free then 
4         G.add_vertex(α(i)); 
5         for each q ∈ NEIGHBORHOOD(α(i), G) 
6             if ((not G.same_component(α(i), q)) and CONNECT(α(i), q)) then 
7                 G.add_edge(α(i), q); 

Figure 5.31: The basic construction algorithm for sampling-based roadmaps. 

to efficiently solve queries. Intuitively, the paths on the roadmap will be easy 
to reach from each of q_I and q_G, and the network of paths in the roadmap can 
be quickly searched for a solution. The general framework presented here was 
mainly introduced in [387] under the name Probabilistic Roadmaps (PRMs). The 
probabilistic aspect, however, is not important to the method. Therefore, we call 
this family of methods sampling-based roadmaps. This distinguishes them from 
combinatorial roadmaps, which will appear in Chapter 6. 

5.6.1 The Basic Method 

Once again, let G(V, E) represent a topological graph in which V is a set of 
vertices and E is the set of paths that map into C_free. Under the multiple-query 
philosophy, motion planning is divided into two phases of computation: 

Preprocessing Phase: During the preprocessing phase, substantial effort 
is invested to build G in a way that will be useful for quickly answering 
future queries. For this reason, it is called a roadmap, which in some sense 
should be capable of reaching every part of C_free. 

Query Phase: During the query phase, a pair, q_I and q_G, is given. Each 
configuration must be connected easily to G using a local planner. Following 
this, a discrete search is performed using any of the algorithms in Section 
2.3 to obtain a sequence of edges that forms a path from q_I to q_G. 

Generic preprocessing phase Figure 5.31 presents an outline of the basic 
preprocessing phase. Figure 5.32 illustrates the algorithm. As seen throughout 
this chapter, the algorithm utilizes a uniform, dense sequence α. In each iteration, 
the algorithm must check whether α(i) ∈ C_free. If α(i) ∈ C_obs, then it must continue to 
iterate until a collision-free sample is found. Once α(i) ∈ C_free, it is inserted 
into G, in Line 4. The next step is to try to connect α(i) to some nearby vertices, 
q, of G. Each connection is attempted by the CONNECT function, which is a typical 
LPM (local planning method) from Section 5.4.1. In most implementations, this 
will simply test the shortest path between α(i) and q. Experimentally, it seems 
Figure 5.32: The sampling-based roadmap is constructed incrementally by attempting 
to connect each new sample, α(i), to nearby vertices in the roadmap. 

most efficient to use the multiresolution, van der Corput-based method described 
at the end of Section 5.3.4 [272]. Instead of the shortest path, it is possible to 
use more sophisticated connection methods, such as the bidirectional algorithm 
in Figure 5.30. If the path is collision free, then CONNECT returns true. 

The same_component condition in Line 6 checks to make sure α(i) and q are 
in different components of G before wasting time on collision checking. This 
ensures that every time a connection is made, the number of connected components 
of G is decreased. This can be implemented very efficiently (near constant time) 
using the previously mentioned union-find algorithm [176, 655]. In some implementations 
this step may be ignored, especially if it is important to keep multiple 
solutions. For example, it may be desirable to generate solution paths from different 
homotopy classes. In this case the condition (not G.same_component(α(i), q)) 
may be replaced with G.vertex_degree(q) < K, for some fixed K (e.g., K = 
15). 
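
A minimal Python sketch of the construction in Figure 5.31 follows. It assumes Euclidean configurations stored as tuples, a straight-line CONNECT that is checked at a fixed resolution, a nearest-K neighborhood, and a small union-find with path halving standing in for same_component. The names UnionFind, segment_free, and build_roadmap are hypothetical and chosen only for illustration.

```python
import math
import random

class UnionFind:
    """Tracks connected components; find() uses path halving."""
    def __init__(self):
        self.parent = {}
    def add(self, x):
        self.parent.setdefault(x, x)
    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x
    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

def segment_free(q1, q2, is_free, step=0.01):
    """Straight-line CONNECT: test points along the segment at a fixed resolution."""
    n = max(1, math.ceil(math.dist(q1, q2) / step))
    return all(is_free(tuple(a + (k / n) * (b - a) for a, b in zip(q1, q2)))
               for k in range(n + 1))

def build_roadmap(samples, is_free, k=15):
    vertices, edges, components = [], [], UnionFind()
    for alpha_i in samples:
        if not is_free(alpha_i):                 # Line 3: skip samples in C_obs
            continue
        # Line 5: the K nearest existing vertices, closest first
        neighbors = sorted(vertices, key=lambda q: math.dist(q, alpha_i))[:k]
        vertices.append(alpha_i)                 # Line 4
        components.add(alpha_i)
        for q in neighbors:
            # Line 6: check components first, since it is far cheaper than CONNECT
            if (components.find(alpha_i) != components.find(q)
                    and segment_free(alpha_i, q, is_free)):
                edges.append((alpha_i, q))       # Line 7
                components.union(alpha_i, q)
    return vertices, edges, components
```

Because an edge is added only between different components, the result is a forest: the number of edges always equals the number of vertices minus the number of components.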

Selecting neighboring samples Several possible implementations of Line 5 
can be made. In all of these, it seems best to sort the vertices that will be 
considered for connection in order of increasing distance from α(i). This makes 
sense because shorter paths are usually less costly to check for collision, and 
they also have a higher likelihood of being collision free. If a connection is made, 
this avoids costly collision checking of longer paths to configurations that would 
eventually belong to the same connected component. 

Several useful implementations of NEIGHBORHOOD are: 

1. Nearest K: The K closest points to α(i) are considered. This requires 
setting the parameter K. A typical value is 15. If you are unsure which 
implementation to use, try this one. 

2. Component K: Try to obtain up to K nearest samples from each connected 
component of G. A reasonable value is K = 1 in this case; otherwise, too 
many connections would be tried. 

3. Radius: Take all points within a ball of radius r, centered at α(i). An 
upper limit, K, may be set to prevent too many connections from being 
attempted. Typically, K = 20. A radius can be determined adaptively by 
shrinking the ball as the number of points increases. This reduction can 
be based on dispersion or discrepancy, if either of these is available for α. 
Note that if the samples are highly regular (e.g., a grid) then choosing the 
nearest K and taking points within a ball become essentially equivalent. 
If the point set is highly irregular, as in the case of random samples, then 
taking the nearest K seems preferable. 

4. Visibility: In Section 5.6.2, a variant will be described for which it is 
worthwhile to try connecting α(i) to all vertices in G. 

Note that all of these require C to be a metric space. One variation that has not yet 
been given much attention is to ensure that the directions of the neighborhood 
points relative to α(i) are distributed uniformly. For example, if the 20 closest 
points are all clumped together in the same direction, then it may be preferable 
to try connecting to a farther point because it is in the opposite direction. 
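
The first three neighborhood choices can be written as small functions. The sketch below assumes a Euclidean metric and brute-force sorting (a real implementation would likely use a nearest-neighbor data structure); the function names and the component_of callback are hypothetical.

```python
import math

def nearest_k(vertices, q, k=15):
    """Nearest K: the k closest vertices, in order of increasing distance."""
    return sorted(vertices, key=lambda v: math.dist(v, q))[:k]

def component_k(vertices, q, component_of, k=1):
    """Component K: up to k nearest vertices from each connected component.

    component_of(v) is assumed to return a hashable label for v's component.
    """
    taken, result = {}, []
    for v in sorted(vertices, key=lambda v: math.dist(v, q)):
        label = component_of(v)
        if taken.get(label, 0) < k:
            taken[label] = taken.get(label, 0) + 1
            result.append(v)
    return result

def within_radius(vertices, q, r, k_max=20):
    """Radius: all vertices within distance r, capped at the k_max closest."""
    close = [v for v in vertices if math.dist(v, q) <= r]
    return sorted(close, key=lambda v: math.dist(v, q))[:k_max]
```

All three return candidates sorted by increasing distance, so that shorter connections are attempted first.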

Query phase In the query phase, it is assumed that G is sufficiently complete 
to answer many queries, each of which gives an initial configuration, q_I, and a 
goal configuration, q_G. First, the query algorithm pretends as if q_I and q_G were 
chosen from α for connection to G. This requires running two more iterations 
of the algorithm in Figure 5.31. If q_I and q_G are successfully connected to other 
vertices in G, then a search is performed for a path that connects the vertex q_I 
to the vertex q_G. The path in the graph corresponds directly to a path in C_free, 
which is a solution to the query. Unfortunately, if this method fails, it cannot 
be determined conclusively whether a solution exists. If the dispersion is known 
for the sample sequence, α, then it is at least possible to conclude that no solution 
exists for the resolution of the planner. In other words, if a solution does exist, it 
would require the path to travel through a corridor no wider than the radius of 
the largest empty ball [453]. 
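
As a rough illustration of the query phase, the following Python sketch attaches q_I and q_G to a copy of the roadmap and then runs a breadth-first search. It makes simplifying assumptions: the roadmap is an adjacency dictionary, connect is any user-supplied local planner that returns true or false, and each query configuration is linked to every visible vertex among its K nearest rather than applying the same_component test. The names attach and query are hypothetical.

```python
import math
from collections import deque

def attach(graph, q, connect, k=15):
    """Insert q as a vertex and link it to visible vertices among its k nearest."""
    candidates = sorted((v for v in graph if v != q),
                        key=lambda v: math.dist(v, q))[:k]
    graph.setdefault(q, set())
    for v in candidates:
        if connect(q, v):
            graph[q].add(v)
            graph[v].add(q)
    return bool(graph[q])

def query(adjacency, q_init, q_goal, connect, k=15):
    """Return a list of vertices from q_init to q_goal, or None on failure."""
    graph = {v: set(nbrs) for v, nbrs in adjacency.items()}   # keep the roadmap intact
    if not (attach(graph, q_init, connect, k) and attach(graph, q_goal, connect, k)):
        return None
    parent = {q_init: None}
    frontier = deque([q_init])
    while frontier:                       # breadth-first search over the roadmap
        v = frontier.popleft()
        if v == q_goal:
            path = []
            while v is not None:
                path.append(v)
                v = parent[v]
            return path[::-1]
        for w in graph[v]:
            if w not in parent:
                parent[w] = v
                frontier.append(w)
    return None
```

A return value of None only means that this roadmap could not answer the query; as noted above, it does not prove that no solution exists.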
Figure 5.33: Examples such as these are difficult because of the narrow corridor 
that links two portions of C_free. 



Some analysis There have been many works that analyze the performance 
of sampling-based roadmaps. The basic idea from one of them [49] is briefly 
presented here. Consider problems such as those in Figure 5.33, in which the 
CONNECT method will most likely fail, even though a connection exists. The 
higher-dimensional versions of these problems are even more difficult. Many planning 
problems involve moving a robot through an area with tight clearance. This 
will generally cause narrow channels to form in C_free, which leads to a challenging 
planning problem for the sampling-based roadmap algorithm. Finding the escape 
of a bug trap is also challenging, but for the roadmap methods, even traveling 
through a corridor is hard (unless more sophisticated LPMs are used). 

Let V(q) denote the set of all configurations that can be connected to q using 
the CONNECT method. Intuitively, this can be considered as the set of all configurations 
that can be "seen" from q using line-of-sight visibility, as shown in Figure 
5.34a. 

The ε-goodness of C_free is defined as 

$$\epsilon = \min_{q \in C_{free}} \left\{ \frac{\mu(V(q))}{\mu(C_{free})} \right\}, \qquad (5.39)$$

in which μ represents the measure. Intuitively, ε represents the smallest fraction of 
C_free that is visible from any point. In terms of ε and the number of vertices in G, 
bounds can be established that yield the probability that a solution will be found 
[49]. The main difficulties are that the ε-goodness concept is very conservative 
(it uses worst-case analysis over all configurations), and ε-goodness is defined 
in terms of the structure of C_free, which cannot be computed efficiently. This 
result and other related results are interesting for gaining a better understanding 
of sampling-based planning, but such bounds are difficult to use in a particular 
application to determine whether an algorithm will perform well. 
(a) Visibility definition (b) Visibility roadmap 

Figure 5.34: (a) V(q) is the set of points reachable by the LPM from q. (b) A 
visibility roadmap has two kinds of vertices: guards, which are shown in black, 
and connectors, shown in white. Guards are not allowed to see other guards. 
Connectors must see at least two guards. 

5.6.2 Visibility Roadmap 

One of the most interesting variations of sampling-based roadmaps is the visibility 
roadmap [705]. The approach works very hard to ensure that the roadmap representation 
is small, yet covers C_free well. The running time is often greater than 
that of the basic algorithm in Figure 5.31, but the extra expense is usually worthwhile if 
the multiple-query philosophy is taken to its fullest extent. 
The idea is to define two different kinds of vertices in G: 

Guards: To become a guard, a vertex, q, must not be able to see other guards. 
Thus, the visibility region, V(q), must be empty of guards. 

Connectors: To become a connector, a vertex, q, must see at least two 
guards. Thus, there exist guards q_1 and q_2 such that q ∈ V(q_1) ∩ V(q_2). 

The roadmap construction phase proceeds similarly to the algorithm in Figure 
5.31. The NEIGHBORHOOD function returns all vertices in G. Therefore, for each new 
sample α(i), an attempt is made to connect it to every other vertex in G. 

The main novelty of the visibility roadmap is that a strong criterion exists to 
determine whether to keep α(i) and its associated edges in G. There are three 
possible cases for each α(i): 

1. The new sample, α(i), is not able to connect to any guards. In this case, 
α(i) earns the privilege of becoming a guard itself, and is inserted into G. 

2. The new sample can connect to guards from at least two different connected 
components of G. In this case, it becomes a connector, and is inserted into G 
along with its associated edges that connect it to these guards from different 
components. 

3. Neither of the previous two conditions was satisfied. This means that the 
sample could only connect to guards in the same connected component. In 
this case, α(i) is discarded. 
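
The three cases translate directly into code. The Python sketch below is an illustration only: it tests visibility against guards alone, keeps one edge per visible component, and merges components by relabeling, which is simple but not efficient. The function name visibility_roadmap and the connect callback are assumptions made for the example.

```python
def visibility_roadmap(samples, is_free, connect):
    """Classify each free sample as a guard, a connector, or discard it."""
    guards, connectors, edges = [], [], []
    component = {}                       # guard -> component label
    for alpha_i in samples:
        if not is_free(alpha_i):
            continue
        visible = [g for g in guards if connect(alpha_i, g)]
        if not visible:                  # Case 1: sees no guard, so it becomes one
            component[alpha_i] = len(guards)
            guards.append(alpha_i)
            continue
        reps = {}                        # one visible guard per distinct component
        for g in visible:
            reps.setdefault(component[g], g)
        if len(reps) >= 2:               # Case 2: it joins different components
            connectors.append(alpha_i)
            for g in reps.values():
                edges.append((alpha_i, g))
            target = min(reps)
            for g in guards:             # merge the joined components
                if component[g] in reps:
                    component[g] = target
        # Case 3: only one component is visible, so the sample is discarded
    return guards, connectors, edges
```

For example, with guards at (0, 0) and (2, 0) that cannot see each other, a later sample at (1, 0) that sees both becomes a connector, whereas a sample that sees only (0, 0) is dropped.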

Figure 5.35 shows the dramatic reduction in the number of vertices for two 
different examples. 14 Each column from top to bottom shows the problem, a basic 
sampling-based roadmap, and the visibility roadmap. The first example is for a 
point robot, and the second example is for a rectangular robot that can translate 
or rotate. 

One problem with the method described is that it does not allow guards to 
be deleted in favor of better guards that might appear later. The placement of 
guards depends strongly on the order in which samples appear in α. The method 
may perform poorly if guards are not positioned well early in the sequence. It 
would be better to have an adaptive scheme that allows guards to be 
reassigned in later iterations as better positions become available. Accomplishing 
this efficiently remains an open problem. 

5.6.3 Heuristics for Improving Roadmaps 

The quest to design a good roadmap through sampling has spawned many heuristic 
approaches to sampling and making connections in roadmaps. Most of these 
exploit properties specific to the shape of the configuration space and/or the 
particular geometry and kinematics of the robot and obstacles. The emphasis is 
usually on finding ways to dramatically reduce the number of required samples. 
Several of these methods are briefly described here. 

Original node enhancement [387] This heuristic strategy focuses effort on 
nodes that were difficult to connect to other nodes in the roadmap construction 
algorithm in Figure 5.31. A probability distribution, P(v), is defined over the 
vertices v ∈ V. A number of iterations are then performed in which a vertex is 
sampled from V according to P(v), and then some random motions are performed 
from v to try to reach new configurations. These new configurations are added as 
vertices, and attempts are made to connect them to other vertices, as selected by 
the NEIGHBORHOOD function in an ordinary iteration of the algorithm in Figure 
5.31. A recommended heuristic [387] for defining P(v) is to define a statistic for 
each v as n_f/(n_t + 1), in which n_t is the total number of connections attempted 
for v, and n_f is the number of times these attempts failed. The probability P(v) 
is assigned as n_f/((n_t + 1)m), in which m is the sum of the statistics over all v ∈ V 
(this serves to normalize the statistics to obtain a valid probability distribution). 
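
The distribution P(v) is easy to compute from the per-vertex counters. In the Python sketch below, the dictionary layout (vertex mapped to the pair (n_t, n_f)) and the uniform fallback when no attempt has ever failed are assumptions of the example, not part of the heuristic as stated.

```python
import random

def enhancement_distribution(stats):
    """stats maps each vertex v to (n_t, n_f); returns P(v) = n_f / ((n_t + 1) m)."""
    weights = {v: n_f / (n_t + 1) for v, (n_t, n_f) in stats.items()}
    m = sum(weights.values())            # normalizing constant
    if m == 0:                           # no failures recorded: fall back to uniform
        return {v: 1.0 / len(stats) for v in stats}
    return {v: w / m for v, w in weights.items()}

def sample_vertex(P, rng):
    """Draw one vertex according to P."""
    vertices = list(P)
    return rng.choices(vertices, weights=[P[v] for v in vertices], k=1)[0]
```

With counters a: (3, 3), b: (1, 0), and c: (7, 2), the statistics are 0.75, 0, and 0.25, so m = 1 and vertex a, which failed most often relative to its attempts, is selected three times as often as c.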

Sampling on the C_free boundary [13, 16] This scheme is based on the intuition 
that it is sometimes better to sample along the boundary, ∂C_free, rather than 



14 These examples are taken from a class project of Andrew Olson and Kevin Crotty completed 
at Iowa State University. 




Figure 5.35: The visibility roadmap is more costly to construct, but can dramat- 
ically reduce the number of vertices. 
Figure 5.36: To obtain samples along the boundary, binary search is used along 
random directions from a sample in C_obs. 

wasting samples on large areas of C_free that might be free of obstacles. Figure 5.36 
shows one way in which this can be implemented. For each sample of α(i) that 
falls into C_obs, a number of random directions are chosen in C; these directions can 
be sampled using the S^n method in Section 5.2.2. For each direction, a binary 
search is performed to get a sample in C_free that is as close as possible to C_obs. 
The order of point evaluation in the binary search is shown in Figure 5.36. Let 
τ : [0, 1] → C denote the path, for which τ(0) ∈ C_obs and τ(1) ∈ C_free. In the first step, 
test the midpoint, τ(1/2). If τ(1/2) ∈ C_free, this means that ∂C_free lies between 
τ(0) and τ(1/2); otherwise, it lies between τ(1/2) and τ(1). The next iteration 
selects the midpoint of the path segment that contains ∂C_free. This will be either 
τ(1/4) or τ(3/4). The process continues recursively until the desired resolution 
is obtained. 
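
The binary search can be sketched as follows in Python. The example assumes a straight-line path τ in a Euclidean C, draws random directions by normalizing Gaussian components (one common way to sample a sphere uniformly), and uses a fixed reach to find the free endpoint; the function names and the reach and resolution parameters are hypothetical.

```python
import math
import random

def random_direction(n, rng):
    """Uniform random unit vector in R^n, obtained by normalizing Gaussians."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:
            return [x / norm for x in v]

def boundary_sample(q_obs, q_free, is_free, resolution=1e-3):
    """Bisect tau(s) = q_obs + s (q_free - q_obs); return a free point near the boundary."""
    tau = lambda s: tuple(a + s * (b - a) for a, b in zip(q_obs, q_free))
    length = math.dist(q_obs, q_free)
    lo, hi = 0.0, 1.0                    # invariant: tau(lo) in C_obs, tau(hi) in C_free
    while (hi - lo) * length > resolution:
        mid = 0.5 * (lo + hi)
        if is_free(tau(mid)):
            hi = mid                     # boundary lies between tau(lo) and tau(mid)
        else:
            lo = mid                     # boundary lies between tau(mid) and tau(hi)
    return tau(hi)

def sample_near_boundary(q_obs, is_free, rng, reach=1.0):
    """Shoot one random ray of length reach from q_obs; bisect if its end is free."""
    d = random_direction(len(q_obs), rng)
    q_far = tuple(a + reach * x for a, x in zip(q_obs, d))
    if not is_free(q_far):
        return None
    return boundary_sample(q_obs, q_far, is_free)
```

The search finds one crossing of ∂C_free along the segment; if the segment crosses the boundary several times, it is not necessarily the crossing closest to τ(0).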

Gaussian sampling [86] The Gaussian sampling strategy follows some of the 
same motivation for sampling on the boundary. In this case, the goal is to obtain 
points near ∂C_free by using a Gaussian distribution that biases the samples to 






S. M. LaValle: Planning Algorithms 




Figure 5.37: The bridge test finds narrow corridors by examining a triple of samples. 




Figure 5.38: The medial axis is traced out by the centers of the largest inscribed 
balls. The five line segments inside of the rectangle correspond to the medial axis. 

be closer to ∂C_free, but the bias is gentler, as prescribed by the variance parameter 
of the Gaussian. The samples are generated as follows. Generate one sample, 
q_1 ∈ C, uniformly at random. Following this, generate another sample, q_2 ∈ C, 
according to a Gaussian with mean q_1; the distribution must be adapted for any 
topological identifications and/or boundaries of C. If one of q_1 or q_2 lies in C_free 
and the other lies in C_obs, then the one that lies in C_free is used as a vertex in the 
roadmap. For some examples, this dramatically prunes the number of required 
vertices. 
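
One attempt of Gaussian sampling might look like the following Python sketch. It assumes that C is the unit cube [0, 1]^dim with no topological identifications, and it adapts the Gaussian to the boundary of C by simple rejection; both choices, along with the name gaussian_sample and the default sigma, are assumptions made for illustration.

```python
import math
import random

def gaussian_sample(is_free, rng, sigma=0.05, dim=2):
    """Return a free configuration near the boundary of C_free, or None."""
    q1 = tuple(rng.random() for _ in range(dim))          # uniform over C
    while True:                                           # Gaussian around q1, kept inside C
        q2 = tuple(rng.gauss(x, sigma) for x in q1)
        if all(0.0 <= x <= 1.0 for x in q2):
            break
    free1, free2 = is_free(q1), is_free(q2)
    if free1 != free2:                                    # exactly one of the pair is free
        return q1 if free1 else q2
    return None                                           # both free or both in collision
```

Most attempts return None, which is the point of the method: only pairs that straddle ∂C_free contribute a vertex, and sigma controls how far from the boundary such vertices can lie.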



Bridge test sampling [341] The Gaussian sampling strategy decides to keep 
a point based in part on testing a pair of samples. This idea can be carried 
one step further to obtain a bridge test, which uses three samples along a line 
segment. If the samples are arranged as shown in Figure 5.37, then the middle 
sample becomes a vertex. This is based on the intuition that narrow corridors are 
thin in at least one direction. The bridge test indicates where a corridor is 
thin, which is a difficult and important place to locate a vertex. 
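
A single bridge test attempt can be sketched in Python as follows. The sketch takes the arrangement in Figure 5.37 to mean that the two outer samples lie in C_obs and the middle one lies in C_free, which is the usual form of the test; the second endpoint is drawn from a Gaussian around the first, and the name bridge_test_sample and the default sigma are hypothetical.

```python
import random

def bridge_test_sample(is_free, rng, sigma=0.1, dim=2):
    """Return the midpoint of a 'bridge' between two colliding samples, or None."""
    q1 = tuple(rng.random() for _ in range(dim))
    if is_free(q1):                       # the first endpoint must be in C_obs
        return None
    q2 = tuple(rng.gauss(x, sigma) for x in q1)
    if is_free(q2):                       # the second endpoint must be in C_obs
        return None
    mid = tuple(0.5 * (a + b) for a, b in zip(q1, q2))
    return mid if is_free(mid) else None  # keep the midpoint only if it is free
```

Because both cheap rejections come before the midpoint check, most attempts cost only one or two collision tests.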
Medial axis sampling [332, 490, 774] Rather than trying to sample close 
to the boundary, another strategy is to force the samples to be as far from the 
boundary as possible. Let (X, ρ) be a metric space. Let a maximal ball be a ball 
B(x, r) ⊆ X that is not a proper subset of any other ball contained in X. The centers of all 
maximal balls trace out a one-dimensional set of points referred to as the medial 
axis. A simple example of a medial axis is shown for a rectangular subset of ℝ^2 
in Figure 5.38. The medial axis in C_free is based on the largest balls that can be 
inscribed in cl(C_free). Sampling on the medial axis is generally difficult, especially 
because the representation of C_free is implicit. Distance information from collision 
checking can be used to start with a sample, α(i), and iteratively perturb it to 
increase its distance from ∂C_free [490, 774]. Sampling on the medial axis of W \ O 
has also been proposed [332]. In this case, the medial axis in W \ O is easier to 
compute, and can be used to heuristically guide the placement of good roadmap 
vertices in C_free. 
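
The idea of perturbing a sample to increase its clearance can be illustrated with a simple randomized hill climb. The Python sketch below is a generic stand-in rather than the procedure of [490, 774]: it assumes that a clearance function returning the distance to ∂C_free is available, and the function name, step size, and iteration count are arbitrary choices for the example.

```python
import random

def retract_toward_medial_axis(q, clearance, rng, step=0.02, iters=200):
    """Accept random perturbations of q only when they increase the clearance."""
    best = clearance(q)
    for _ in range(iters):
        candidate = tuple(x + rng.uniform(-step, step) for x in q)
        c = clearance(candidate)
        if c > best:
            q, best = candidate, c
    return q
```

For the rectangle [0, 2] x [0, 1], for instance, the clearance is min(x, 2 - x, y, 1 - y); a sample started near the bottom edge drifts upward toward the segment y = 1/2, which is part of the medial axis.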



Literature 

Explain [81] somewhere. 

Should say something about disconnection proofs. 

Need to cite the exact collision detection method of Latombe et al from WAFR 
2002. 

The following is from the section entitled "The Rise of Sampling-Based Motion 
Planning" in an ISRR 2003 paper coauthored with Steve Lindemann. It needs to 
be shortened here. 

To fully understand the continuing evolution of sampling-based motion plan- 
ning and its current issues, it is helpful to understand how sampling-based algo- 
rithms have developed and changed over time. In this section, we will describe 
how sampling-based algorithms began to emerge, and how they have continued 
to develop up to the present time. 

In the 1980s, constructing a representation of C_obs, either completely or in part, 
was the predominant approach to motion planning. Examples include the planner 
by Brooks and Lozano-Perez for a polygon rotating and translating in the plane 
[102], work by Donald for planning for a 3D rigid body [205, 207], and a planner 
by Lozano-Perez for manipulator arms [505]. References to many combinatorial 
planners and a few early sampling-based ones can be found in Hwang and Ahuja's 
survey [357]. Glimpses of sampling-based motion planning began to emerge in 
the late 1980s. These algorithms typically centered around advances in efficient 
calculation of distance between polyhedra. Faverjon and Tournassoud introduced 
a manipulator planner that computed local collision-free motions using distance 
computation and hierarchical CAD models [243, 242]. The introduction of algorithms 
such as the Gilbert-Johnson-Keerthi algorithm [279] made sampling-based 
approaches more common. A good example of such an approach is the manipulator 
planner of Paden et al. [602]. They create a 2^d-tree representation of the configuration 
space, labelling cells as "freespace," "obstacle," or "not sure or mixed." To 
classify cells correctly (or at least, conservatively), they find a uniform bound on 
the Jacobian for the given manipulator. Then, based on this information and the 
workspace distance returned by the GJK algorithm, they can determine whether 
or not an entire cell can be classified as freespace or obstacle. If neither applies, 
then the cell is labelled mixed and will be subdivided, if a predefined minimum 
resolution has not yet been reached. After preprocessing the environment in such 
a way, it is simple to find a path, if one exists in the tree, or to determine that 
greater resolution is required to resolve small mixed cells. 

The use of distance information from a collision detector permits hierarchical 
grid-based approaches as in Paden et al., but computing this information is 
more expensive than simply returning the boolean result of an intersection test 
(the most basic form of collision detection). A less expensive grid-based approach 
might discretize the space at a sufficiently fine resolution and use an inexpensive 
collision detection method to determine whether each cell belongs to C_free, thus 
creating a bitmap of C-space. The resulting data structure can then be searched 
by classical AI search techniques to find a path, if one exists. In fact, this very approach 
was taken by Lengyel et al. [479]. Their algorithm uses graphics hardware 
to plan for a polygonal robot translating and rotating in the plane. They divide the 
rotational degree of freedom, θ, into a number of slices, and use graphics hardware 
to calculate the Minkowski sum of the robot and obstacles for a particular value 
of θ. They combine all resulting slices to obtain a bitmap representation of the 
three-dimensional C-space, which they then search with a dynamic programming 
technique. 

In general, however, this kind of approach is limited to lower dimensions because 
the number of resulting grid cells grows exponentially with the number of DOFs 
of the problem, and a fine resolution is required. Hence, checking them all for 
collision becomes impractical. Nevertheless, when general sampling-based motion 
planning algorithms began to proliferate in the early 1990s, several of these were 
clearly influenced by the grid search approach. We will consider two of this type, 
along with two other early sampling-based algorithms, before describing several 
more recent, state-of-the-art sampling-based motion planners. 

One early planner that strongly reflects classical grid search techniques is that 
of Kondo [406]. Kondo's planner is based on the observation that even if a fine 
grid is placed over the configuration space, it may be possible to find a solution 
without visiting large portions of that grid. Hence, if one delays collision checking 
until needed (a "lazy" approach), only relatively few collision checks will need to 
be performed, thus avoiding the expensive preprocessing step of naive grid search. 
The planner searches a grid bidirectionally, assigning cost f(C) = g(C) + h(C) to 
each expanded grid cell, in which g(C) is the standard cost-to-come and h(C) is 
a heuristic weighted sum-of-squares cost. Kondo's planner uses multiple heuristics 
(i.e., different assignments of the heuristic weight constants), and adaptively 
selects between them based on an estimate of their effectiveness. Hence, the effectiveness 
of the planner strongly depends on the quality of the heuristic functions, 
and on the planner's ability to choose the appropriate one to apply. If either 
of these is poor, then performance will degrade greatly. Kondo gives several 
six-dimensional examples, with the resolution of the grid being 2^7 points per axis, 
yielding 2^42 total grid cells. However, for the results reported, typically fewer than 
20,000 collision checks were needed to solve the problem. The influence of Kondo's 
multiple-heuristic approach can be seen in recent PRM-related work by Isto [362]. 

In 1990, Barraquand and Latombe introduced the planner that came to be 
called the Randomized Path Planner (RPP) [51]. This planner is important for three 
primary reasons: first, it was perhaps the first well-known sampling-based 
motion planner; second, it solved problems with many DOFs, typically many 
more than other planners at the time were capable of handling; and third, it 
advocated randomization as a means of efficiently finding solutions in the high-dimensional 
configuration space. Its influence in this third respect can hardly be 
overestimated, since for the following decade virtually every significant sampling-based 
motion planning algorithm used randomization. In fact, only recently has 
the role of randomization in sampling-based motion planning begun to be studied 
in depth. We will discuss this issue in some depth in subsequent sections. RPP 
operates as follows: first, the planner defines several potential fields over a grid 
imposed on the workspace; each potential field corresponds to a "control point" on 
the robot. A finer-resolution grid is also defined over the configuration space, and 
the potential value of each configuration-space grid cell is defined by the following 
non-negative, real-valued function on C_free: 

$$U(q) = G\left(U_{p_1}(X(p_1, q)), \ldots, U_{p_n}(X(p_n, q))\right),$$

in which p_1, . . . , p_n are the control points, X is a function mapping a point on 
the robot to its position in the workspace at the given configuration, and G is an 
arbitration function. Then, beginning at the initial state, the planner descends 
the gradient of the C-space potential field, until a local minimum is reached. If 
the minimum is the global minimum, the goal state has been attained; else, the 
planner executes a series of random walks with the aim of escaping the local 
minimum. After this, the planner again descends the potential field gradient, 
continuing this process until the goal state has been reached or a user-specified 
amount of time has elapsed. This latter condition is necessary because unlike 
combinatorial planners, sampling-based planners are typically unable to recognize 
that a problem has no solution; in such a situation, they will never terminate. The 
key to this planner's performance is the construction of good potential fields and a 
good arbitration function, which can be quite difficult to construct in practice. If 
the potential fields result in many local minima, the planner can perform poorly. 

Another early sampling-based motion planner is the SANDROS planner of 
Chen and Hwang [140], which was developed for manipulator arms. This planner 
searches in a multi-resolution manner over a non-uniform grid (i.e., the resolution 
on the coordinate axes may differ). The axes are given different resolutions because 
for manipulator arms, links near the base have the greatest impact on end effector 
position. Like the planner of Paden et al., this algorithm uses the GJK algorithm [279] for 
collision detection. It also uses the distance information to place links of the arm 
at maximal distance from the workspace obstacles. 

Finally, a planner (later termed the ZZ-method) was introduced by Glavina 
in 1990 [281] that foreshadows PRMs in many respects. The ZZ-method first 
attempts to connect the initial and goal configurations using a "straight-and-slide" local 
planner (a method that does not allow backtracking but is more powerful than 
the straight-line local planner). If this fails, which is usually the case, then a new 
configuration is chosen as a subgoal (Glavina advocates using jittered sampling), 
and an attempt is made to connect the subgoal to the initial and goal configurations using 
the same local planner. If this fails, new subgoals are added and attempts are 
made to connect them with previously existing subgoals, as well as the initial and 
goal configurations. Edges between subgoals are checked for collisions at a predefined 
subsampling resolution. Glavina also identifies the well-known "narrow 
corridor" problem and uses connected component analysis to speed up his planner. 
However, he uses a primitive collision detection method, which prevents him from 
applying his algorithm to challenging, high-DOF problems (this was remedied in 
some extensions of his work [38, 39]); also, the straight-and-slide local planner 
becomes expensive in high dimensions. In principle, however, the ZZ-method 
contains many elements that have become common in more recent algorithms. 

Since the introduction of these early algorithms, sampling-based motion planning 
has continued to develop. Changes have been made to deal with failings of 
previous planners, and new exploration paradigms have been investigated. We discuss 
four well-known recent motion planning algorithms: PRMs, Ariadne's Clew, 
the expansive-space planner by Hsu et al., and RRTs. 

In recent years, the most popular paradigm for sampling-based motion plan- 
ning has the probabilistic roadmap [387]. The original PRM, along with its nu- 
merous extensions and variants (e.g., [13, 81, 482, 629, 705, 774, 780]), have been 
successfully applied to problems in robotics, computer animation, and computa- 
tional biology [404, 627, 715]. While there is a strong connection to Glavina's work, 
there are several important differences. Foremost among these is that the PRM 
is designed for multiple-queries rather than a single-query. Hence, the placement 
of landmarks is seen as constructing a reusable roadmap in the PRM method, 
not as generating query subgoals as in the ZZ-method. Second, the ZZ-method 
attempted to connect each new landmark (subgoal) to all previous ones; PRMs at- 
tempt to connect to a more carefully-chosen subset of these, which is typically the 
K nearest landmarks from each connected component, or all subgoals within some 
specified radius. Third, the PRM uses a simpler local planner, often either
straight-line or rotate-at-s [18], unlike the ZZ-method's more expensive straight- 
and-slide local planner. Finally, methods are used to identify difficult regions of 
C-space and sample in those regions (the "roadmap enhancement" phase). Along 
with the use of more sophisticated collision detection methods, these factors make
the PRM more effective for challenging motion planning problems.

Ariadne's Clew is a single-query algorithm that grows a tree from the initial 
configuration toward the goal configuration [543, 544]. At each step, it searches 
for a new "landmark," reachable from a current landmark by a Manhattan path 
of a certain order, which is maximally distant from the set of all current
landmarks. The authors use highly parallelized genetic algorithms to search for a solution
to this optimization problem. Once a new landmark has been added to the tree, 
the planner attempts to connect this new landmark to the goal. To improve per- 
formance, when the algorithm encounters an obstacle in trajectory calculation it 
"bounces" off it. Experimental results give fast solution times for motion of a 
6-DOF arm in a dynamic environment. One limitation, however, is the difficult 
heuristic choices required for the genetic algorithm. 

Hsu et al. introduced a single-query path planner^15 for "expansive" configuration
spaces in [344]. The notion of expansiveness is related to how much of the free
space is visible from a single free configuration or connected set of free
configurations, and extends the idea of ε-goodness [49]. The expansive-space planner grows
a tree from the initial configuration. Each node x in the tree has an associated
weight, w(x), which is defined to be the number of nodes inside $N_d(x)$, the ball of radius
d centered at x. At each iteration, it picks a node to extend; the probability that
a given node x will be selected is 1/w(x), in which w is the weight function. Then,
K points are sampled from $N_d(x)$ for the selected node x, and the weight function
value for each is calculated. Each new point y is retained with probability 1/w(y),
and the planner attempts to connect each retained point to the node x. Hence,
we see a similarity between this planner and Ariadne's Clew, in that they each try
to "push" the tree toward unexplored areas of free space. The main drawback
of the approach is that the required d and K parameters may vary dramatically
across problems, and they are difficult to estimate for a given problem.
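
The weighting scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of [344]: configurations are 2D points, the names `weight`, `pick_node`, `expand`, `is_free`, and `can_connect` are hypothetical, and the selection probabilities are normalized so that they are proportional to 1/w(x).

```python
import math
import random

def weight(q, nodes, d):
    """Number of tree nodes inside N_d(q), the ball of radius d centered at q."""
    return sum(1 for other in nodes if math.dist(q, other) <= d)

def pick_node(nodes, d, rng):
    """Select a tree node with probability proportional to 1/w(x)."""
    inv = [1.0 / weight(x, nodes, d) for x in nodes]  # w(x) >= 1: x counts itself
    return rng.choices(nodes, weights=inv, k=1)[0]

def expand(nodes, d, K, is_free, can_connect, rng):
    """One iteration: sample K points in N_d(x), keep each with probability 1/w(y)."""
    x = pick_node(nodes, d, rng)
    added = []
    for _ in range(K):
        # Rejection-sample a point uniformly from the 2D ball of radius d around x.
        while True:
            dx, dy = rng.uniform(-d, d), rng.uniform(-d, d)
            if dx * dx + dy * dy <= d * d:
                break
        y = (x[0] + dx, x[1] + dy)
        if not is_free(y):
            continue
        w_y = max(1, weight(y, nodes, d))  # guard against round-off at the ball boundary
        if rng.random() < 1.0 / w_y and can_connect(x, y):
            added.append((x, y))
    nodes.extend(y for _, y in added)      # new nodes join the tree after the iteration
    return added
```

Nodes in crowded regions have large weights, so both the node selection and the retention test favor sparsely explored areas, which is the "pushing" effect noted above.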

Finally, we describe Rapidly-exploring Random Trees (RRTs) [449, 466], which 
were developed for problems with differential constraints, such as kinodynamic 
planning and nonholonomic planning. Their introduction has stimulated a flurry of
recent applications and extensions (e.g., [94, 109, 145, 164, 192, 261, 371, 375, 395, 
486, 756, 780]). In its basic form, the RRT attempts to grow a tree from the initial 
configuration to the goal configuration as follows: take a random sample, and find 
its nearest neighbor in the search tree. Then, grow toward the sample from its 
nearest neighbor. This process is repeated until the initial and goal configurations 
are connected. The best-performing RRT planner uses a more greedy connection 
strategy (at each iteration, attempt to make a complete connection from the 
nearest neighbor to the sample) and searches bidirectionally. This planner rapidly 
explores the configuration space because it is Voronoi-biased: at each iteration it 
tends to grow from the node with the largest Voronoi area. This is because the
probability that a node is selected for expansion is directly proportional to the
volume of its Voronoi cell. In contrast to Ariadne's Clew and the expansive-space
planner, which work hard to push the tree toward unexplored regions, RRTs are
pulled into these regions by virtue of the sampling and connection strategy. This
avoids the need for complicated parameter tuning, but comes at the expense of
performing nearest neighbor queries.

15 Some authors refer to this and virtually all planning algorithms that use randomization
as PRMs. To avoid confusion, we do not use this term for single-query planners, such as the
planner of Hsu et al., even though it is called a PRM by its authors.
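
The basic RRT loop (sample, find the nearest neighbor, grow toward the sample) can be sketched as below. This is a simplified illustration, not the bidirectional, greedy variant mentioned above: it is unidirectional, takes fixed-size steps, uses a brute-force nearest-neighbor query, and checks only the new configuration for collision rather than the whole connecting segment. The names `rrt`, `steer`, `is_free`, `bounds`, `step`, and `goal_tol` are assumptions introduced for the example.

```python
import math
import random

def nearest(tree, q):
    """Brute-force nearest-neighbor query over the nodes of the tree."""
    return min(tree, key=lambda n: math.dist(n, q))

def steer(q_near, q_rand, step):
    """Move from q_near toward q_rand by at most `step`."""
    dist = math.dist(q_near, q_rand)
    if dist <= step:
        return q_rand
    t = step / dist
    return tuple(a + t * (b - a) for a, b in zip(q_near, q_rand))

def rrt(q_init, q_goal, is_free, bounds, step=0.5, goal_tol=0.5, max_iters=5000, seed=0):
    """Grow a tree from q_init; return a path ending near q_goal, or None."""
    rng = random.Random(seed)
    parent = {q_init: None}                      # tree stored as node -> parent
    for _ in range(max_iters):
        q_rand = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        q_near = nearest(parent, q_rand)
        q_new = steer(q_near, q_rand, step)
        if q_new in parent or not is_free(q_new):
            continue
        parent[q_new] = q_near
        if math.dist(q_new, q_goal) <= goal_tol:
            path = [q_new]                       # walk parent pointers back to the root
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]
    return None
```

Nodes with large Voronoi cells are the nearest neighbors of more random samples, so they are extended more often; no explicit weights need to be maintained.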

Hierarchical collision detection is covered in [556, 493, 293]. The incremental 
collision detection ideas are borrowed from the Lin-Canny algorithm [492] and 
V-Clip [556]. See also [639, 293, 556, 300, 221]. Survey: [493].

Nearest neighbors: [28, 27, 263, 401, 599, 721, 359, 788].

Exercises

These are merely sketches of ideas. Needs to be updated...

1. Show that using uniform mass over $S^3$ yields the Haar measure for $SO(3)$.

2. Show some unidimensional dispersion bounds. 

3. Construct a bound on distance traveled by points on A when a quaternion 
is perturbed. 

4. Make up some bug trap examples with real geometry. 

5. (Open problem) Prove there are d + 1 branches for an RRT in an "infinite" 
disc. 

6. Do something with Cantor sets. 

7. Devise a good way to select a subset of neighbors in a high-dim grid. 

8. Something with average-case dispersion. 

9. Try RRTs with more-powerful descent functions.

10. Experiment with visibility pruning in the RRT.

Chapter 6

Combinatorial Motion Planning 



Chapter Status 



What does this mean? Check http://msl.cs.uiuc.edu/planning/status.html for
information on the latest version.



Combinatorial approaches to motion planning find paths through the con- 
tinuous configuration space without resorting to approximations. Because of this 
property, they are alternatively referred to as exact algorithms. This is in contrast 
to the sampling-based motion planning algorithms from Chapter 5. 

6.1 Introduction 

All of the algorithms presented in this chapter are complete, which means that 
for any problem instance (over the space of problems for which the algorithm is 
designed), the algorithm will either find a solution, or will correctly report that no 
solution exists. By contrast, in the case of sampling-based planning algorithms,
weaker notions of completeness were tolerated: resolution completeness, dispersion 
completeness, and probabilistic completeness. 

Representation is important. When studying combinatorial motion planning
algorithms, it is important to carefully consider the definition of the input. What 
is the representation used for the robot and obstacles? What set of transforma- 
tions may be applied to the robot? What is the dimension of the world? Are 
the robot and obstacles convex? Are they piecewise linear? The specification of 
possible inputs defines a set of problem instances on which the algorithm will op- 
erate. If the instances have certain convenient properties (e.g., low dimensionality, 
convex models), then a combinatorial algorithm may provide an elegant, practical
solution. If the set of instances is too broad, then a requirement of both
completeness and practical solutions may be unreasonable. Many general formulations of
motion planning problems are PSPACE-hard^1; therefore, such a hope appears
unattainable. Nevertheless, there exist general, complete motion planning
algorithms. Note that focusing on the representation is the opposite philosophy 
from sampling-based planning, which hides these issues in the collision detection 
module. 

Reasons to study combinatorial methods. Based on these observations,
there are generally two good reasons to study combinatorial approaches to motion 
planning: 

1. In many applications, one may only be interested in a special class of plan- 
ning problems. For example, the world might be two-dimensional, and the 
robot might only be capable of translation. For many special classes, elegant 
and efficient algorithms can be developed. These algorithms are complete, do
not depend on approximation, and can offer much better performance than
incomplete methods, such as those in Chapter 5. 

2. It is both interesting and satisfying to know that there are complete algo- 
rithms for an extremely broad class of motion planning problems. Thus, 
even if the class of interest does not have some special limiting assumptions, 
there still exist general-purpose tools and algorithms that can solve it. These 
algorithms also provide theoretical upper bounds on the time needed to solve 
motion planning problems. 

Warning: some methods are impractical. Be careful about making the
wrong assumptions when studying the algorithms of this chapter. A few of them 
are efficient and easy to implement, but many might be neither. Even if an algo- 
rithm has an amazing asymptotic running time, it might be close to impossible to 
implement. For example, one of the most famous algorithms from computational 
geometry can split a simple^2 polygon into triangles in $O(n)$ time for a polygon
with n edges [137]. This is so amazing that it was covered in the New York 
Times, but the algorithm is so complicated that it is doubtful that anyone will 
ever implement it. Sometimes it is preferable to use an algorithm that has worse 
theoretical running time, but is much easier to understand and implement. In 
general, though, it is valuable to understand both kinds of methods and decide 
on the tradeoffs for yourself. It is also an interesting intellectual pursuit to try 
to determine how efficiently a problem can be solved, even if the result is mainly 
of theoretical interest. This might motivate others to look for simpler algorithms 
that have the same or similar asymptotic running times. 

1 This implies NP-hard. An overview of such complexity statements appears in Section 6.5.1.
2 A polygonal region that has no holes.

Roadmaps. Virtually all combinatorial motion planning approaches construct a
roadmap along the way to solving queries. This notion was introduced in Section
5.6, but in Chapter 6 stricter requirements are imposed in the roadmap definition
because any algorithm that constructs one needs to be complete. Some of the
algorithms in this chapter will first construct a cell decomposition of $C_{free}$ from
which the roadmap is consequently derived. Other methods directly construct a
roadmap without consideration of cells.

Let G be a topological graph (defined in Example 4.1.6) that maps into $C_{free}$.
Furthermore, let $S \subset C_{free}$ be the set of all points reached by G, as defined in
(5.38). The graph G is called a roadmap if it satisfies two important conditions:

Accessibility: From any $q \in C_{free}$, it is simple and efficient to compute a
path $\tau : [0, 1] \to C_{free}$ such that $\tau(0) = q$ and $\tau(1) = s$, in which s may be
any point in S. Usually, s is the closest point to q, assuming C is a metric
space.

Connectivity Preserving: Using the first condition, it will be possible to
connect some $q_i$ and $q_g$ to some $s_1$ and $s_2$, respectively, in S. The second
condition requires that if there exists a path $\tau : [0, 1] \to C_{free}$ such that
$\tau(0) = q_i$ and $\tau(1) = q_g$, then there also exists a path $\tau' : [0, 1] \to S$, such
that $\tau'(0) = s_1$ and $\tau'(1) = s_2$. Thus, solutions will not be missed because
G fails to capture the connectivity of $C_{free}$. This ensures that complete
algorithms will be developed.

By satisfying these properties, a roadmap provides a discrete representation 
of the continuous motion planning problem without losing any of the original 
connectivity information needed to solve it. A query, $q_i$ and $q_g$, is solved using G
by connecting each query point to the roadmap, which relies on accessibility, and 
then performing a discrete graph search on G, which relies on G being connectivity 
preserving to ensure that a solution will be found when one exists. 
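
The two-step query procedure can be sketched as follows, assuming the roadmap is stored as an adjacency dictionary and that a hypothetical `connect` function implements the accessibility condition by returning a roadmap vertex reachable from a query point (or `None`). Breadth-first search stands in for any of the discrete graph searches; none of these names come from the text.

```python
from collections import deque

def solve_query(adj, connect, q_i, q_g):
    """Connect both query points to the roadmap, then search the graph.

    adj:     dict mapping each roadmap vertex to an iterable of its neighbors.
    connect: function mapping a configuration to a reachable roadmap vertex, or None.
    Returns the list [q_i, s_1, ..., s_2, q_g], or None if no path exists in G.
    """
    s1, s2 = connect(q_i), connect(q_g)
    if s1 is None or s2 is None:
        return None
    parent = {s1: None}
    queue = deque([s1])
    while queue:
        v = queue.popleft()
        if v == s2:
            path = []
            while v is not None:          # follow parent pointers back to s1
                path.append(v)
                v = parent[v]
            return [q_i] + path[::-1] + [q_g]
        for u in adj.get(v, ()):
            if u not in parent:
                parent[u] = v
                queue.append(u)
    return None
```

If G is connectivity preserving, a `None` result from the graph search means that no solution exists, rather than that one was missed.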

6.2 Polygonal Obstacle Regions 

Rather than diving into the most general forms of combinatorial motion plan- 
ning, it is helpful to first see several methods explained for a case that is easy to 
visualize. Several elegant, straightforward algorithms exist for the case in which 
$C = \mathbb{R}^2$ and $C_{obs}$ is polygonal. Most of these cannot be directly extended to higher
dimensions; however, some of the general principles remain the same. Therefore, 
it is very instructive to see how combinatorial motion planning approaches work in 
two dimensions. There are also applications where these algorithms may directly 
apply. One example is planning for a small mobile robot which may be modeled 
as a point moving on a building floor that can be modeled with a 2D polygonal 
floor plan.

Figure 6.1: A polygonal model specified by four oriented simple polygons.



After covering representations in Section 6.2.1, Sections 6.2.2-6.2.4 present 
three different algorithms to solve the same problem. The one in Section 6.2.2 
first performs cell decomposition on the way to building the roadmap, and the ones 
in Sections 6.2.3 and 6.2.4 directly produce a roadmap. The algorithm in Section 
6.2.3 computes maximum clearance paths, and the one in Section 6.2.4 computes
shortest paths (which consequently have no clearance). 

6.2.1 Representation 

Assume that $W = \mathbb{R}^2$, the obstacles, O, are polygonal, and the robot, A, is a
polygonal body that is only capable of translation. Under these assumptions, $C_{obs}$
will be polygonal. For the special case in which A is a point in W, O maps directly
to $C_{obs}$ without any distortion. Thus, the problems considered in this section may
also be considered as planning for a point robot. If A is not a point robot, then
the Minkowski difference, (4.43), of O and A must be computed. For the case
in which both A and each component of O are convex, the algorithm in Section
4.3.2 can be applied to compute each component of $C_{obs}$. In general, both A and
O may be nonconvex. They may even contain holes, which results in a $C_{obs}$ model
such as that shown in Figure 6.1. In this case, A and O may be decomposed
into convex components, and the Minkowski difference can be computed for each
pair of components. The decompositions into convex components can actually be
performed by adapting the cell decomposition algorithm that will be presented in
Section 6.2.2. Once the Minkowski differences have been computed, they need
to be merged to obtain a representation that can be specified in terms of simple
polygons, as in Figure 6.1. An efficient algorithm to perform this merging is given
in Section 2.4 of [189]. It can also be based on many of the same principles as the
planning algorithm in Section 6.2.2.

To implement the algorithms described in this section, it will be helpful to 
have a data structure that allows convenient access to the information contained 
in a model such as Figure 6.1. How is the outer boundary represented? How are 
holes inside of obstacles represented? How do we know which holes are inside 
of which obstacles? These questions can be efficiently answered by using the 
doubly-connected edge list data structure, which was described in Section 3.1.3 
for consistent labeling of polyhedral faces. We will need to represent models such
as Figure 6.1, and any other information that planning algorithms would like to maintain
during execution. There are three different records: 

Vertices: Every vertex, v, contains a pointer to a point $(x, y) \in C = \mathbb{R}^2$,
and a pointer to some half-edge that has v as its origin. 

Faces: Every face has one pointer to a half-edge on the boundary that 
surrounds the face; the pointer value is nil if the face is the outermost 
boundary. The face also contains a list of pointers for each component (e.g., 
a hole) that is contained inside of that face. Each pointer in the list points 
to a half-edge of the component's boundary.

Half-edges: Each half-edge is directed so that the obstacle portion is always 
to its left. It contains five different pointers. There is a pointer to its origin 
vertex. There is a twin half-edge pointer, which may point to a half-edge that 
runs in the opposite direction (see Section 3.1.3). If the half-edge borders 
an obstacle, then this pointer is nil. Each half-edge also contains pointers 
to the next and previous half-edges in the circular chain. Such chains are
oriented so that the obstacle portion (or a twin half-edge) is always to its 
left. Half-edges are always arranged in circular chains to form the boundary 
of a face. The half-edge must also store a pointer to this face. 

For the example in Figure 6.1, there are four circular chains of half-edges, which
each bound a different face. The face record of the small triangular hole points to
the obstacle face that contains the hole. Each obstacle contains a pointer to the
face represented by the outermost boundary. Note that by consistently assigning
orientations to the half-edges, circular chains that bound an obstacle always run
counterclockwise, and chains that bound holes run clockwise. There are no twin
half-edges because all half-edges bound part of $C_{obs}$. The doubly-connected edge
list data structure is general enough to allow extra edges to be inserted that slice
through $C_{free}$. These edges will not be on the border of $C_{obs}$, but they can be
managed using twin half-edge pointers. This will be useful for the algorithm in
Section 6.2.2.
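
The three record types can be written down directly. The following is one possible sketch of the records described above; the class and field names, and the helpers `make_chain` and `chain_points`, are assumptions chosen for illustration, with `None` playing the role of nil.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass(eq=False)
class Vertex:
    point: Tuple[float, float]
    half_edge: Optional["HalfEdge"] = None    # some half-edge that has this origin

@dataclass(eq=False)
class Face:
    boundary: Optional["HalfEdge"] = None     # None for the outermost boundary
    components: List["HalfEdge"] = field(default_factory=list)  # one per hole

@dataclass(eq=False)
class HalfEdge:
    origin: Vertex
    twin: Optional["HalfEdge"] = None         # None if the half-edge borders an obstacle
    next: Optional["HalfEdge"] = None
    prev: Optional["HalfEdge"] = None
    face: Optional[Face] = None

def make_chain(points, face):
    """Link the points, in order, into one circular chain of half-edges bounding `face`."""
    verts = [Vertex(p) for p in points]
    edges = [HalfEdge(origin=v, face=face) for v in verts]
    n = len(edges)
    for i, e in enumerate(edges):
        e.next = edges[(i + 1) % n]
        e.prev = edges[(i - 1) % n]
        verts[i].half_edge = e
    face.boundary = edges[0]
    return edges

def chain_points(start):
    """Walk a circular chain once and return the origin points in order."""
    pts, e = [], start
    while True:
        pts.append(e.origin.point)
        e = e.next
        if e is start:
            return pts
```

Passing the points in counterclockwise order to `make_chain` yields a chain oriented as described for obstacle boundaries; a hole would be given in clockwise order and its starting half-edge appended to the `components` list of the enclosing face.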
Figure 6.2: There are four general cases: 1) extending upward and downward, 2)
upward only, 3) downward only, and 4) no possible extension. 

6.2.2 Vertical Cell Decomposition 

Cell decompositions will be defined formally (as cell complexes) in Section 6.3, 
but here the notion will be used informally. Combinatorial methods must
construct a finite data structure that exactly represents the planning problem. Cell
decomposition algorithms achieve this by partitioning $C_{free}$ into a finite set of regions
called cells. The term k-cell will be used to refer to a k-dimensional cell. The cell
decomposition should satisfy three properties:

1. Computing a path from one point to another inside of a cell must be trivially 
easy. For example, if every cell is convex, then any pair of points in a cell 
can be connected by a line segment. 

2. Adjacency information for the cells can be easily extracted to build the
roadmap.

3. For a given $q_i$ and $q_g$, it should be efficient to determine which cells contain
them.

If a cell decomposition satisfies these properties, then the motion planning problem
is reduced to a graph search problem. Once again the algorithms of Section 2.3
may be applied; however, in the current setting, the entire graph, G, is usually
known in advance.^3 This was not assumed for discrete planning problems.

Defining the vertical decomposition. An algorithm will next be presented
that constructs a vertical cell decomposition [136], which partitions $C_{free}$ into a
finite collection of 2-cells and 1-cells. Each 2-cell will be either a trapezoid that has
vertical sides, or it will be a triangle (which is a degenerate trapezoid). For this
reason, the method is sometimes called trapezoidal decomposition. The
decomposition is defined as follows. Let P denote the set of vertices used to define $C_{obs}$.
At every $p \in P$, try to extend rays upward and downward through $C_{free}$, until
the boundary of $C_{obs}$ is hit. There are four possible cases, shown in Figure 6.2,
depending on whether or not it is possible to extend in each of the two directions.
If $C_{free}$ is partitioned according to these rays, then a vertical decomposition will
result. Extending these rays for the example in Figure 6.3.a leads to the
decomposition of $C_{free}$ shown in Figure 6.3.b. Note that only trapezoids and triangles
are obtained for the 2-cells in $C_{free}$.

3 Exceptions to this are the algorithms mentioned in Section 6.5.3, which obtain greater
efficiency by only maintaining one connected component of $C_{obs}$.




Figure 6.3: The vertical cell decomposition method uses the cells to construct a 
roadmap, which is searched to yield a solution to a query. 

Every 1-cell is a vertical segment that serves as the border between two 2-cells. 
We must ensure that topology is correctly handled. Recall that $C_{free}$ was defined
to be an open set. Every 2-cell is actually defined to be an open set in $\mathbb{R}^2$; thus, it
is the interior of a trapezoid or triangle. The 1-cells are the interiors of segments.
It is tempting to make 0-cells, which correspond to the endpoints of segments, but
these will not be allowed because they lie in $C_{obs}$.

General position issues. What if two points along $C_{obs}$ lie on a vertical line
that slices through $C_{free}$? What happens when one of the edges of $C_{obs}$ is
vertical? These are special cases that have been ignored so far. Throughout much
of combinatorial motion planning it is common to ignore such special cases and
assume $C_{obs}$ is in general position. This means that if all of the data points are
perturbed by a small amount in some random direction, the probability that the
special case remains is zero. Since a vertical edge is no longer vertical after being
slightly perturbed, it is not considered as part of general position. The general
position assumption is usually made because it greatly simplifies the presentation
of an algorithm (and in some cases, its asymptotic running time is even lower). In
practice, however, this assumption can be very frustrating. Most of the
implementation time is often devoted to correctly handling such special cases. Performing
random perturbations may avoid this problem, but it tends to unnecessarily
complicate the solutions. For the vertical decomposition, the problems are not too
difficult to handle without resorting to perturbations (this is requested in Exercise
1); however, in general, it is important to be aware of this difficulty, which is not
as easy to fix in most other settings.




Figure 6.4: The roadmap derived from the vertical cell decomposition. 




Figure 6.5: An example solution path. 



Defining the roadmap. To enable the handling of motion planning queries, a
roadmap is constructed from the vertical cell decomposition. For each cell, $C_j$,
let $q_j$ denote a designated sample point such that $q_j \in C_j$. The sample points
can be selected as the cell centroids, but the choice is not too important. Let
G(V, E) be a topological graph defined as follows. For every cell, define a vertex
$v \in V$ that corresponds to its sample point. There is a vertex for every 1-cell and
every 2-cell. For each 2-cell, define an edge $e \in E$ from its sample point to the sample
point of every 1-cell that lies along its boundary. Each edge is a line-segment
path between the sample points of the cells. The resulting graph is a roadmap,
as depicted in Figure 6.4. The accessibility condition is satisfied because every
sample point can be reached by a straight-line path thanks to the convexity of
every cell. The connectivity condition is satisfied because G is derived directly
from the cell decomposition, which also preserves the connectivity of $C_{free}$. Once
the roadmap is constructed, the cell information is no longer needed for answering
planning queries.
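
The construction of G from the cells can be sketched as follows. The input layout is an assumption made for the example: each 2-cell is a list of corner points, each 1-cell is a pair of segment endpoints, and `borders` lists which 1-cells lie on the boundary of which 2-cells. The average of the corners is used as the sample point in place of the centroid; for a nondegenerate triangle or trapezoid it lies in the interior of the cell.

```python
def vertex_average(points):
    """Average of the given points; interior to a nondegenerate convex cell."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def build_roadmap(two_cells, one_cells, borders):
    """Return (samples, adj) for the roadmap derived from a cell decomposition.

    two_cells: list of corner lists, one per trapezoid or triangle.
    one_cells: list of (bottom, top) endpoint pairs, one per vertical segment.
    borders:   list of (i, j) pairs meaning 1-cell j borders 2-cell i.
    """
    samples = {("2", i): vertex_average(c) for i, c in enumerate(two_cells)}
    samples.update({("1", j): vertex_average(s) for j, s in enumerate(one_cells)})
    adj = {v: set() for v in samples}
    for i, j in borders:
        # One roadmap edge joins the sample of a 2-cell to each bordering 1-cell.
        adj[("2", i)].add(("1", j))
        adj[("1", j)].add(("2", i))
    return samples, adj
```

Because edges only join a 2-cell sample to the sample of a 1-cell on its boundary, each edge stays inside the closure of a single convex cell.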

Solving a query. Once the roadmap is obtained, it is straightforward to solve a
motion planning query, $q_i$ and $q_g$. Let $C_0$ and $C_k$ denote the cells that contain $q_i$
and $q_g$, respectively. In the graph, G, search for a path that connects the sample
point of $C_0$ to the sample point of $C_k$. If no such path exists, then the planning
algorithm correctly declares that no solution exists. If one does exist, then let $C_1$,
$C_2$, ..., $C_{k-1}$ denote the sequence of 1-cells and 2-cells visited along the computed
path in G from $C_0$ to $C_k$.

A solution path can be formed by simply "connecting the dots". Let $q_0$, $q_1$, $q_2$,
..., $q_{k-1}$, $q_k$ denote the sample points along the path in G. There is one sample
point for every cell that is crossed. The solution path, $\tau : [0, 1] \to C_{free}$, is formed
by setting $\tau(0) = q_i$, $\tau(1) = q_g$, and visiting each of the points in the sequence
from $q_0$ to $q_k$ by traveling along the shortest path. For the example, this leads to
the solution shown in Figure 6.5. In selecting the sample points, it was important
to ensure that each path segment from the sample point of one cell to the sample
point of its neighboring cell is collision free.^4

Computing the decomposition. The problem of efficiently computing the
decomposition has not yet been considered. Without concern for efficiency, the
problem appears simple enough that all of the required steps can be computed by
brute force computations. If $C_{obs}$ has n vertices, then this approach would take at
least $O(n^2)$ time because intersection tests have to be made between each vertical
ray and each segment. This even ignores the data structure issues involved in finding
the cells that contain the query points, and in building the roadmap that holds
the connectivity information. By careful organization of the computation, it turns
out that all of this can be nicely handled, and the resulting running time is only
$O(n \lg n)$.
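
For comparison with the sweep-based method, the brute-force intersection test for a single vertex can be sketched as below. It only finds the nearest edge hit above and below p; deciding whether each ray actually passes through the free space (the four cases of Figure 6.2) is left out. The function names and the edge representation as endpoint pairs are assumptions, and general position is assumed, so vertical edges are skipped.

```python
def edge_y_at(edge, x):
    """y-coordinate of a non-vertical edge ((x1, y1), (x2, y2)) at abscissa x."""
    (x1, y1), (x2, y2) = edge
    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)

def vertical_hits(p, edges):
    """Return (y_below, y_above): the nearest hits of vertical rays cast from p.

    Edges that have p as an endpoint are skipped; None means no edge is hit.
    Each call tests every edge, so one call costs O(n) time.
    """
    px, py = p
    below = above = None
    for edge in edges:
        (x1, _), (x2, _) = edge
        # The strict test also rejects vertical edges, for which x1 == x2.
        if p in edge or not (min(x1, x2) < px < max(x1, x2)):
            continue
        y = edge_y_at(edge, px)
        if y > py and (above is None or y < above):
            above = y
        elif y < py and (below is None or y > below):
            below = y
    return below, above
```

Calling `vertical_hits` once for each of the n vertices performs on the order of n^2 intersection tests, which is the cost that the sweep-based organization avoids.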



4 This is the reason why the approach is defined differently from Chapter 1 of [437]. In
that case, sample points were not placed in the interiors of the 2-cells, and collision could result
for some queries.

Plane sweep principle. The algorithm is based on the plane sweep or line sweep
principle from computational geometry [83, 189, 219], which forms the basis of 
many combinatorial motion planning algorithms and many other algorithms in 
general. Much of computational geometry can be considered as the development 
of data structures and algorithms that generalize the sorting problem to multiple 
dimensions. In other words, it deals with carefully "sorting" geometric informa- 
tion. 

The word "sweep" is used to refer to these algorithms because it can be imag- 
ined that a line (or plane, etc.) sweeps across the space, only to stop in places 
where some critical changes occur in the information. This gives the intuition, 
but the sweeping line is not explicitly represented by the algorithm. To construct 
the vertical decomposition, we imagine that a vertical line sweeps from $x = -\infty$
to $x = \infty$, using $(x, y)$ to denote a point in $W = \mathbb{R}^2$.

From Section 6.2.1, note that the set P of $C_{obs}$ vertices is the only data in
$\mathbb{R}^2$ that is explicitly referenced. It therefore seems reasonable that interesting
things can only occur at these points. Sort the points in P in increasing order
by their x-coordinate. Assuming general position, no two points will have the
same x-coordinate. The points in P will now be visited in order of increasing x
value. Each visit to a point will be referred to as an event. Before, after, and in
between every event, a list, L, of some $C_{obs}$ edges will be maintained. This list
must be maintained at all times in the order that the edges appear if stabbed by
the vertical sweep line. The ordering is maintained from lower to higher.

Algorithm execution. Figure 6.6 and Table 6.1 show how the algorithm
proceeds. Initially, L is empty and a doubly-connected edge list is used to represent
$C_{free}$. Each connected component of $C_{free}$ will be a single face in the data structure.
Suppose inductively that after several events occur, L is correctly maintained. For
each event, one of the four cases in Figure 6.2 occurs. By maintaining L in a
balanced binary search tree [176], the edges above and below p can be determined in
$O(\lg n)$ time. This is much better than $O(n)$ time, which would arise from
checking every segment. Depending on which of the four cases from Figure 6.2 occurs,
different updates are made. If the first case occurs, then two different edges are
inserted, and the face of which p is on the border is split two times by vertical line
segments. For each of the two vertical line segments, two half-edges are added,
and all faces and half-edges must be updated correctly (this operation is local in
that only records adjacent to where the change occurs need to be updated). The
next two cases in Figure 6.2 are simpler; only a single face split is made. For the
final case, no splitting occurs.
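
The lookup of the edges directly above and below an event point can be sketched with a binary search over L. Here a sorted Python list stands in for the balanced binary search tree, so the lookup takes a logarithmic number of comparisons while insertions and deletions in a plain list would still take linear time. The names `y_at` and `neighbors_in_status` are assumptions, and L is assumed to contain no edge that passes through p.

```python
def y_at(edge, x):
    """y-coordinate of a non-vertical edge ((x1, y1), (x2, y2)) at abscissa x."""
    (x1, y1), (x2, y2) = edge
    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)

def neighbors_in_status(status, p):
    """Return (edge_below, edge_above) for the event point p.

    status: the edges currently stabbed by the sweep line at x = p[0],
            ordered from lower to higher. Either result may be None.
    """
    px, py = p
    lo, hi = 0, len(status)
    while lo < hi:                         # find the first edge strictly above p
        mid = (lo + hi) // 2
        if y_at(status[mid], px) > py:
            hi = mid
        else:
            lo = mid + 1
    above = status[lo] if lo < len(status) else None
    below = status[lo - 1] if lo > 0 else None
    return below, above
```

The edge order in L is valid between events because edges of $C_{obs}$ do not cross; only their heights along the sweep line change, which is why the heights are evaluated at the current x-coordinate rather than stored.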

Once the face splitting operations have been performed, L needs to be updated.
When the sweep line crosses p, two edges are always affected. For example, in
the first or last cases of Figure 6.2, two edges must be inserted into L (the mirror
images of these cases will cause two edges to be deleted from L). If the middle two
cases occur, then one edge is replaced by another in L. These insertion and deletion
operations can be performed in $O(\lg n)$ time, assuming L is implemented using a
balanced binary search tree. Since there are n events, the running time for the
construction algorithm is $O(n \lg n)$.

Figure 6.6: There are 14 events in this example.

Event   Sorted Edges in L
  0     {a, b}
  1     {d, b}
  2     {d, f, e, b}
  3     {d, f, i, b}
  4     {d, f, g, h, i, b}
  5     {d, f, g, j, n, h, i, b}
  6     {d, f, g, j, n, b}
  7     {d, j, n, b}
  8     {d, j, n, m, l, b}
  9     {d, j, l, b}
 10     {d, k, l, b}
 11     {d, b}
 12     {d, c}
 13     {}

Table 6.1: The status of L is shown after each of 14 events occurs. Before the
first event, L is empty.

Figure 6.7: The maximum clearance roadmap keeps as far away from the $C_{obs}$ as
possible. This involves traveling along points that are equidistant from two or
more points on the boundary of $C_{obs}$.

Figure 6.8: Voronoi roadmap pieces are generated in one of three possible cases:
a) between two edges, b) between two points, and c) between a point and an edge.
The third case leads to a quadratic curve.

The roadmap, G, can be computed from the face pointers of the doubly- 
connected edge list. A more elegant approach is to incrementally build G in each 
event. In fact, all of the pointer maintenance required to obtain a consistent 
doubly-connected edge list can be ignored if desired, as long as G is correctly 
built, and the sample point is obtained for each cell along the way. We can even 
go one step further, and forget about the cell decomposition, and instead build a 
topological graph of line-segment paths between all sample points of adjacent cells.

6.2.3 Maximum Clearance Roadmaps 

This method directly produces a roadmap without the consideration of cells. A
maximum clearance roadmap tries to keep as far as possible from $C_{obs}$, as shown
for the corridor in Figure 6.7. The resulting solution paths are sometimes
preferred in mobile robotics applications because it is difficult to measure and control
the precise position of a mobile robot. Traveling along the maximum clearance
roadmap reduces the chances of collisions due to these uncertainties. Other names
for this roadmap are generalized Voronoi diagram and retraction method [590]. It
is considered as a generalization of Voronoi diagrams, which were considered in
Section 5.2.2, from the case of points to the case of polygons. Each point along
a roadmap edge is equidistant from two edges of $C_{obs}$. Each roadmap vertex
corresponds to the intersection of two or more roadmap edges, and is therefore
equidistant from three or more edges of $C_{obs}$.

The retraction term comes from topology, and provides a nice intuition about
the method. A subspace $S$ is a deformation retract of a topological space $X$ if
a continuous homotopy, $h : X \times [0, 1] \rightarrow X$, can be defined as
follows [328]:

1. $h(x, 0) = x$ for all $x \in X$.

2. $h(x, 1)$ is a continuous function that maps every element of $X$ to some
element of $S$.

3. For all $t \in [0, 1]$, $h(s, t) = s$ for any $s \in S$.

The intuition is that $C_{free}$ is gradually thinned through the homotopy
process until a skeleton, $S$, is obtained. An approximation to this shrinking
process can be imagined by shaving off a thin layer around the whole boundary of
$C_{free}$. If this is repeated iteratively, the maximum clearance roadmap is the
only part that will remain (assuming we prevent the remaining slivers from being
shaved away).
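
As a concrete instance of the three conditions, consider a standard example that
is not taken from this section: let $X$ be the punctured plane and let $S$ be the
unit circle. The homotopy below is one possible choice; it slides every point
radially onto the circle, and the three properties can be checked term by term.

```latex
\[
  h(x,t) = (1-t)\,x + t\,\frac{x}{\|x\|}
         = \Bigl(1 - t + \frac{t}{\|x\|}\Bigr)\,x ,
  \qquad x \in X = \mathbb{R}^2 \setminus \{0\},\quad t \in [0,1].
\]
\begin{align*}
  h(x,0) &= x
      && \text{(condition 1)} \\
  h(x,1) &= \frac{x}{\|x\|} \in S = \{\, x \in \mathbb{R}^2 : \|x\| = 1 \,\}
      && \text{(condition 2)} \\
  h(s,t) &= (1-t)\,s + t\,s = s \quad \text{for all } s \in S
      && \text{(condition 3)}
\end{align*}
```

The scalar factor $1 - t + t/\|x\|$ is strictly positive, so $h(x,t)$ never
reaches the origin and the homotopy stays inside $X$.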

To construct the maximum clearance roadmap, the concept of features from
Section 5.3.3 will be used again. Let the feature set refer to the set of all
edges and vertices of $C_{obs}$. Candidate paths for the roadmap are produced by
every pair of features. This leads to a naive $O(n^4)$ time algorithm as follows.
For every edge-edge feature pair, generate a line as shown in Figure 6.8.a. For
every vertex-vertex pair, generate a line as shown in Figure 6.8.b. Finally, for
every edge-point pair, generate a parabolic curve as shown in Figure 6.8.c. (The
maximum clearance path between a point and a line is a parabola.) The portions of
the paths that actually lie on the maximum clearance roadmap are determined by
intersecting the curves. Several algorithms exist that provide better asymptotic
running time [474, 481], but they are considerably more difficult to implement.
The best-known algorithm runs in $O(n \lg n)$ time, in which $n$ is the number of
roadmap curves [686].
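
The parenthetical claim about the point-line case can be checked numerically.
The sketch below assumes a hypothetical coordinate frame in which the line is the
x-axis and the point sits at (0, f) with f > 0; the function names are
illustrative and do not come from the text.

```python
import math

def bisector_y(x, f):
    """Height of the curve equidistant from the point (0, f) and the line y = 0.

    Setting sqrt(x^2 + (y - f)^2) = y and solving for y gives a parabola.
    Assumes f > 0 (a frame chosen only for this sketch).
    """
    return (x * x + f * f) / (2.0 * f)

def dist_to_point(p, q):
    """Euclidean distance between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def dist_to_line(p):
    """Distance from p to the line y = 0."""
    return abs(p[1])

# Sample a few points on the curve and compare both clearances.
f = 2.0
samples = [(x, bisector_y(x, f)) for x in (-3.0, 0.0, 1.5)]
gaps = [abs(dist_to_point(p, (0.0, f)) - dist_to_line(p)) for p in samples]
```

Every entry of `gaps` should be zero up to floating-point error, which is the
equidistance property that places this curve on the roadmap.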

6.2.4 Shortest Path Roadmaps 

Instead of generating paths that maximize clearance, suppose that the goal is
to find shortest paths. This leads to the shortest path roadmap, which is also
called the reduced visibility graph in [437]. The idea was first introduced in
[582] and may perhaps be the first example of a motion planning algorithm. The
shortest path roadmap is in direct conflict with maximum clearance because
shortest paths tend to graze the corners of $C_{obs}$. In fact, the problem is
ill posed because $C_{free}$ is an open set. For any path
$\tau : [0, 1] \rightarrow C_{free}$, it is always possible to find a shorter one.
For this reason, we must consider the problem of determining shortest paths in
cl($C_{free}$), the closure of $C_{free}$. This means that the robot is allowed to
"touch" or "graze" the obstacles, but it is not allowed to penetrate them. To
actually use the computed paths as solutions to a motion planning problem, they
need to be slightly adjusted so that they come very close to $C_{obs}$, but do not
make contact. This will slightly increase the path length, but this additional
cost can be made arbitrarily small as the path approaches touching $C_{obs}$.

Figure 6.9: A bitangent edge must touch two reflex vertices that are mutually
visible from each other, and the line must extend outward past each of them
without poking into $C_{obs}$.

The shortest path roadmap, $G$, is constructed as follows. Let a reflex vertex
be a polygon vertex for which the interior angle (in $C_{free}$) is greater than
$\pi$. All vertices of a convex polygon (in general position) are reflex
vertices. The vertices of $G$ are the reflex vertices. Edges of $G$ are formed
from two different sources:

Consecutive reflex vertices: If two reflex vertices are the endpoints of
an edge of $C_{obs}$, then a corresponding edge is made in $G$.

Bitangent edges: If a bitangent line can be drawn through a pair of reflex
vertices, then a corresponding edge is made in $G$. A bitangent line, depicted
in Figure 6.9, is a line that is incident to two or more reflex vertices and does
not poke into the interior of $C_{obs}$ at any of these vertices. Furthermore, all
of these vertices must be mutually visible from each other.

Figure 6.10: The shortest path roadmap includes edges between consecutive reflex
vertices on $C_{obs}$ and also bitangent edges.

An example of the resulting roadmap is shown in Figure 6.10. Note that the
roadmap may have isolated vertices, such as the one at the top of the figure. To
solve a query, $q_I$ and $q_G$ are connected to all roadmap vertices that are
visible; this is shown in Figure 6.11. This makes an extended roadmap that is
searched for a solution. If Dijkstra's algorithm is used, and if each edge is
given a cost that corresponds to its path length, then the resulting solution
path will be the shortest path between $q_I$ and $q_G$. The shortest path for the
example in Figure 6.11 is shown in Figure 6.12.
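
The query step can be sketched as a plain Dijkstra search in which each edge cost
is the Euclidean length of the segment. The graph below is a made-up extended
roadmap with two reflex vertices; the vertex names, coordinates, and the function
`shortest_path` are assumptions for illustration, and visibility of each edge is
taken for granted rather than computed.

```python
import heapq
import math

def shortest_path(coords, edges, start, goal):
    """Dijkstra's algorithm on an undirected graph with Euclidean edge costs.

    coords: dict mapping a vertex name to its (x, y) position.
    edges:  iterable of (u, v) pairs, assumed to be collision-free segments.
    Returns (length, path), or (math.inf, []) if the goal is unreachable.
    """
    adj = {v: [] for v in coords}
    for u, v in edges:
        w = math.dist(coords[u], coords[v])
        adj[u].append((v, w))
        adj[v].append((u, w))

    dist = {start: 0.0}
    parent = {}
    done = set()
    queue = [(0.0, start)]
    while queue:
        d, u = heapq.heappop(queue)
        if u in done:
            continue                      # stale queue entry
        done.add(u)
        if u == goal:
            path = [goal]
            while path[-1] != start:
                path.append(parent[path[-1]])
            return d, path[::-1]
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(queue, (nd, v))
    return math.inf, []

# Hypothetical extended roadmap: the query points plus reflex vertices a and b.
coords = {"qI": (0.0, 0.0), "a": (3.0, 4.0), "b": (3.0, -1.0), "qG": (6.0, 0.0)}
edges = [("qI", "a"), ("a", "qG"), ("qI", "b"), ("b", "qG")]
length, path = shortest_path(coords, edges, "qI", "qG")
```

A binary heap keeps the search at $O((n + m) \lg n)$ for a roadmap with $n$
vertices and $m$ edges, which is usually small next to the cost of building the
roadmap itself.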

If the bitangent tests are performed naively, then the resulting algorithm will
require $O(n^3)$ time, in which $n$ is the number of vertices of $C_{obs}$. There
are $O(n^2)$ pairs of reflex vertices that need to be checked, and each check
requires $O(n)$ time to make certain that no other edges prevent their mutual
visibility. The plane sweep principle from Section 6.2.2 can be adapted to obtain
a better algorithm, which takes only $O(n^2 \lg n)$ time. The idea is to perform a
radial sweep from each reflex vertex, $v$. A ray is started at $\theta = 0$, and
events occur when the ray touches vertices. A set of bitangents through $v$ can
be computed in this way in $O(n \lg n)$ time. Since there are $O(n)$ reflex
vertices, the total running time is $O(n^2 \lg n)$. See Chapter 15 of [189] for
more details. There exists an algorithm that can compute the shortest path
roadmap in time $O(n^2 + m)$, in which $m$ is the total number of edges in the
roadmap [595].

Figure 6.11: To solve a query, $q_I$ and $q_G$ are connected to all visible
roadmap vertices, and graph search is performed.

The shortest path roadmap can be implemented without the use of trigonometric
functions. This greatly improves the numerical robustness of the algorithm. For
a sequence of three points, $p_1$, $p_2$, $p_3$, define the left turn predicate
$f_l : \mathbb{R}^2 \times \mathbb{R}^2 \times \mathbb{R}^2 \rightarrow \{\mathrm{TRUE}, \mathrm{FALSE}\}$
as $f_l(p_1, p_2, p_3) = \mathrm{TRUE}$ if and only if $p_3$ is to the left of the
ray that starts at $p_1$ and pierces $p_2$. A point, $p_2$, is a reflex vertex if
and only if $f_l(p_1, p_2, p_3) = \mathrm{TRUE}$, in which $p_1$ and $p_3$ are the
points before and after, respectively, along the boundary of $C_{obs}$. The
bitangent test can be performed by assigning points as shown in Figure 6.13. A
pair, $p_2$, $p_5$, of vertices should receive a bitangent edge if the following
sentence is FALSE:

\[
  [f_l(p_1, p_2, p_5) \oplus f_l(p_3, p_2, p_5)] \vee
  [f_l(p_4, p_5, p_2) \oplus f_l(p_6, p_5, p_2)],
  \tag{6.1}
\]

in which $\oplus$ denotes logical "exclusive or". The $f_l$ predicate can be
implemented without trigonometric functions by defining

\[
  M(p_1, p_2, p_3) =
  \begin{pmatrix} 1 & x_1 & y_1 \\ 1 & x_2 & y_2 \\ 1 & x_3 & y_3 \end{pmatrix},
  \tag{6.2}
\]

in which $p_i = (x_i, y_i)$. If $\det(M) > 0$, then
$f_l(p_1, p_2, p_3) = \mathrm{TRUE}$; otherwise,
$f_l(p_1, p_2, p_3) = \mathrm{FALSE}$.
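
A minimal sketch of the predicate and the bitangent test follows. The determinant
of the matrix in (6.2) is expanded by hand, so no trigonometry or matrix library
is needed. The index pattern used for (6.1) is an assumed reading: the two
boundary neighbors of each vertex must lie on the same side of the line through
the candidate pair. Mutual visibility is not tested here, and general position
(no three collinear points) is assumed.

```python
def det3(p1, p2, p3):
    """Determinant of the 3x3 matrix M(p1, p2, p3) in (6.2), expanded by hand."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)

def left_turn(p1, p2, p3):
    """f_l: True iff p3 lies strictly to the left of the ray from p1 through p2."""
    return det3(p1, p2, p3) > 0

def is_bitangent(p1, p2, p3, p4, p5, p6):
    """Evaluate the sentence in (6.1) for the candidate pair (p2, p5).

    p1, p3 are the boundary neighbors of p2, and p4, p6 those of p5 (an assumed
    labeling). The pair receives an edge when the sentence is False.
    """
    sentence = (left_turn(p1, p2, p5) != left_turn(p3, p2, p5)) or \
               (left_turn(p4, p5, p2) != left_turn(p6, p5, p2))
    return not sentence

# Hypothetical configuration: both corners touch the line y = 0 from one side.
tangent = is_bitangent((-1, -1), (0, 0), (1, -1), (3, 1), (4, 0), (5, 1))
# Moving p3 across the line makes it cut into the corner at p2.
crossing = is_bitangent((-1, -1), (0, 0), (1, 1), (3, 1), (4, 0), (5, 1))
```

With integer or rational coordinates the test is exact, which is the robustness
benefit mentioned above.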

6.3 Cell Decompositions 

Section 6.2.2 introduced the vertical cell decomposition to solve the motion
planning problem when $C_{obs}$ is polygonal. It is important to understand,
however, that this is just one choice among many for the decomposition. Some of
these choices may not be preferable in 2D; however, they might generalize better
to higher dimensions. Therefore, other cell decompositions are covered in this
section, which provides a smoother transition from vertical cell decomposition
to cylindrical algebraic decomposition in Section 6.4, which solves the motion
planning problem in any dimension for any semi-algebraic model. Along the way, a
cylindrical decomposition will appear in Section 6.3.4 for the special case of a
line-segment robot in $W = \mathbb{R}^2$.

Figure 6.12: The shortest path in the extended roadmap is the shortest path
between $q_I$ and $q_G$.

6.3.1 General Definitions 

In this section, the term complex will be used to refer to a collection of cells 
together with their boundaries. A partition into cells can be derived from a 
complex, but the complex contains additional information that describes how the 
cells must fit together. The term cell decomposition will still refer to the partition 
of the space into cells, which is derived from a complex. 

It is tempting to define complexes and cell decompositions in a very general
manner. Imagine that any partition of $C_{free}$ could be called a cell
decomposition. A cell could be so complicated that the notion would be useless.
Even $C_{free}$ itself could be declared as one big cell. It will be more useful
to build decompositions out of simpler cells, such as ones that contain no holes.
Formally, we will require that every $k$-dimensional cell is homeomorphic to
$B^k \subset \mathbb{R}^k$, an open $k$-dimensional unit ball. From a motion
planning perspective, this still yields cells that are quite complicated, and it
will be up to the particular cell decomposition method to enforce further
constraints to yield a complete planning algorithm.

Two different complexes will be introduced. The simplicial complex is explained
because it is one of the easiest to understand. Although it is useful in many
applications, it is not powerful enough to represent all of the complexes that
arise in motion planning. Therefore, the singular complex is also introduced.
Although this one is more complicated to define, it encompasses all of the cell
complexes that are of interest in this book. It also provides an elegant way to
represent topological spaces. Another important cell complex, which is not
covered here, is the CW-complex [317].

Figure 6.13: The bitangents can be determined by checking for left turns, which
avoids the use of trigonometric functions and their associated numerical
problems.

Simplicial Complex For this definition, it is assumed that $X = \mathbb{R}^n$.
Let $p_1$, $p_2$, $\ldots$, $p_{k+1}$ be $k + 1$ linearly independent$^5$ points
in $\mathbb{R}^n$, with $k \le n$. A $k$-simplex, $[p_1, \ldots, p_{k+1}]$, is
formed from these points as

\[
  [p_1, \ldots, p_{k+1}] =
  \left\{ \sum_{i=1}^{k+1} \alpha_i p_i \in \mathbb{R}^n \;\middle|\;
  0 \le \alpha_i \le 1 \text{ for all } 1 \le i \le k+1
  \text{ and } \sum_{i=1}^{k+1} \alpha_i = 1 \right\},
  \tag{6.3}
\]

in which $\alpha_i p_i$ is scalar multiplication of $\alpha_i$ by each of the
point coordinates. Another way to view (6.3) is as the convex hull of the $k + 1$
points (i.e., all ways to linearly interpolate between them). If $k = 2$, a
triangular region is obtained. For $k = 3$, a tetrahedron is produced.

For any $k$-simplex, choose any $i$ for which $1 \le i \le k + 1$ and set
$\alpha_i = 0$. This yields a $(k - 1)$-dimensional simplex, which is called a
face of the original simplex. A 2-simplex has three faces, each of which is a
1-simplex that may be called an edge. Each 1-simplex (or edge) has two faces,
which are 0-simplexes called vertices.
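
The convex-hull view of a simplex and the face construction can be sketched in a
few lines. The code below represents a simplex as a tuple of points, evaluates a
convex combination (coefficients nonnegative and summing to one), and lists the
faces obtained by dropping one point. The function names and the sample triangle
are illustrative assumptions.

```python
from itertools import combinations

def simplex_point(points, alphas, tol=1e-12):
    """Return sum_i alpha_i * p_i after checking the convex-combination rules."""
    if len(points) != len(alphas):
        raise ValueError("need one coefficient per point")
    if any(a < -tol or a > 1 + tol for a in alphas) or abs(sum(alphas) - 1) > tol:
        raise ValueError("coefficients must lie in [0, 1] and sum to 1")
    dim = len(points[0])
    return tuple(sum(a * p[d] for a, p in zip(alphas, points)) for d in range(dim))

def faces(simplex):
    """All (k-1)-dimensional faces of a k-simplex given as a tuple of points.

    Setting alpha_i = 0 removes p_i, so each face drops exactly one point.
    """
    return [tuple(f) for f in combinations(simplex, len(simplex) - 1)]

triangle = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))   # a 2-simplex in R^2
centroid = simplex_point(triangle, (1 / 3, 1 / 3, 1 / 3))
edges = faces(triangle)                           # three 1-simplexes
endpoints = faces(edges[0])                       # two 0-simplexes
```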

To form a complex, the simplexes will be required to fit together in a nice way.
This yields a high-dimensional notion of a triangulation, which in
$\mathbb{R}^2$ is a tiling of triangular regions. A simplicial complex,
$\mathcal{K}$, is a finite set of simplexes that satisfies the following:

1. Any face of a simplex in $\mathcal{K}$ is also in $\mathcal{K}$.

2. The intersection, $A_1 \cap A_2$, of any two simplexes
$A_1, A_2 \in \mathcal{K}$ is either empty, or $A_1 \cap A_2$ is a common face of
both $A_1$ and $A_2$.

Figure 6.14: To become a simplicial complex, the simplex faces must fit together
nicely. (Panels: not a simplicial complex; a simplicial complex.)

Footnote 5: Form $k$ vectors by subtracting $p_1$ from the other $k$ points.
Arrange the vectors into a $k \times n$ matrix. For linear independence, there
must be at least one $k \times k$ cofactor with a nonzero determinant. For
example, if $k = 2$, then the 3 points cannot be collinear.

Figure 6.14 illustrates these requirements. For $k > 0$, a $k$-cell of
$\mathcal{K}$ is defined to be the interior, int($[p_1, \ldots, p_{k+1}]$), of any
$k$-simplex. For $k = 0$, every 0-simplex can also be considered as a 0-cell. The
set of all of the cells forms a partition of the point set covered by
$\mathcal{K}$. This therefore provides a cell decomposition in a sense that is
consistent with Section 6.2.2.

Singular Complex Simplicial complexes are useful in applications such as
geometric modeling and computer graphics for computing the topology of models.
Due to the complicated topological spaces and decomposition algorithms that
arise in motion planning, they will be insufficient for the most general
problems. A singular complex is a generalization of the simplicial complex.
Instead of being limited to $\mathbb{R}^n$, let a singular complex be defined for
any (Hausdorff) topological space, $X$. The main difference is that for a
simplicial complex, each simplex is a subset of $\mathbb{R}^n$; however, for a
singular complex, each singular simplex is actually a homeomorphism from a
(simplicial) simplex in $\mathbb{R}^n$ to a subset of $X$.

To help understand the idea, first consider a 1D singular complex, which happens
to be a topological graph (this was introduced in Example 4.1.6, and has been
used extensively). The interval $[0, 1]$ is a 1-simplex, and a continuous path
$\tau : [0, 1] \rightarrow X$ is a singular 1-simplex because it is a
homeomorphism of $[0, 1]$ to the image of $\tau$ in $X$. Suppose $G(V, E)$ is a
topological graph. The cells are subsets of $X$ that are defined as follows. Each
point $v \in V$ is a 0-cell in $X$. To follow the formalism, each can be
considered as the image of a function $f : \{0\} \rightarrow X$, which makes it a
singular 0-simplex, because $\{0\}$ is a 0-simplex. For each path $\tau \in E$,
the corresponding 1-cell is

\[
  \{x \in X \mid \tau(s) = x \text{ for some } s \in (0, 1)\}.
  \tag{6.4}
\]

Expressed differently, it is $\tau((0, 1))$, the image of the path $\tau$, except
that the endpoints are removed because they are already covered by the 0-cells
(the cells must form a partition).

These principles will now be generalized to higher dimensions. Since balls and
simplexes of the same dimension are homeomorphic, balls can be used instead of
simplexes in the definition of a singular simplex. Let
$B^k \subset \mathbb{R}^k$ denote a closed, $k$-dimensional unit ball,

\[
  B^k = \{x \in \mathbb{R}^k \mid \|x\| \le 1\},
  \tag{6.5}
\]

in which $\| \cdot \|$ is the Euclidean norm. A singular $k$-simplex is a
continuous mapping $\sigma : B^k \rightarrow X$. Let int($B^k$) refer to the
interior of $B^k$. For $k \ge 1$, the $k$-cell, $C$, corresponding to a singular
$k$-simplex, $\sigma$, is the image $C = \sigma(\mathrm{int}(B^k)) \subseteq X$.
The 0-cells are obtained directly as the images of the singular 0-simplexes. Each
singular 0-simplex maps to the 0-cell in $X$. When $\sigma$ is restricted to
int($B^k$), it actually defines a homeomorphism between int($B^k$) and $C$. Note
that both of these are open sets if $k > 0$.

A simplicial complex required that the simplexes fit together nicely. The same
concept is applied here, but topological concepts are used instead because they
are more general. Let $\mathcal{K}$ be a set of singular simplexes of varying
dimensions. Let $S_k$ denote the union of the images of all singular
$i$-simplexes for all $i \le k$.

A collection of singular simplexes that map into a topological space $X$ is
called a singular complex if

1. For each dimension $k$, the set $S_k \subseteq X$ must be closed. This means
that the cells must all fit together nicely.

2. Each $k$-cell is an open set in the topological subspace $S_k$. Note that
0-cells are open in $S_0$, even though they are usually closed in $X$.

Example 6.3.1 (Vertical decomposition) The vertical decomposition of Section
6.2.2 is a nice example of a singular complex that is not a simplicial complex
because it contains trapezoids. The interior of each trapezoid and triangle forms
a 2-cell, which is an open set. For every pair of adjacent 2-cells, there is a
1-cell on their common boundary. There are no 0-cells because the vertices lie in
$C_{obs}$, not $C_{free}$. The subspace $S_2$ is formed by taking the union of all
2-cells and 1-cells to yield $S_2 = C_{free}$. This does satisfy the closure
requirement because the complex is built in $C_{free}$ only; hence, the
topological space is $C_{free}$. The set $S_2 = C_{free}$ is both open and closed.
The set $S_1$ is the union of all 1-cells. This is also closed because the 1-cell
endpoints all lie in $C_{obs}$. Each 1-cell is also an open set.

One way to avoid some of these strange conclusions from the topology restricted
to $C_{free}$ is to build the vertical decomposition in cl($C_{free}$), the
closure of $C_{free}$. This can be obtained by starting with the previously
defined vertical decomposition, and adding a new 1-cell for every edge of
$C_{obs}$ and a 0-cell for every vertex of $C_{obs}$. Now
$S_2 = \mathrm{cl}(C_{free})$, which is closed in $\mathbb{R}^2$. Likewise, $S_1$
and $S_0$ are closed in the usual way. Each of the individual $k$-dimensional
cells, however, is open in the topological space $S_k$. The only strange case is
that the 0-cells are considered open, but this is true in the discrete
topological space $S_0$. ■



6.3.2 2D Decompositions 

The vertical decomposition method of Section 6.2.2 is just one choice of many
cell decomposition methods for solving the problem when $C_{obs}$ is polygonal.
It provides a nice balance between the number of cells, computational
efficiency, and implementation ease. It is usually possible to decompose
$C_{obs}$ into far fewer convex cells. This would be preferable for
multiple-query applications because fewer paths would be needed in the search
graph. It is unfortunately quite difficult to optimize the number of cells.
Determining the decomposition of a polygonal $C_{obs}$ with holes that uses the
smallest number of convex cells is NP-hard [499, 390]. Therefore, we are willing
to tolerate non-optimal decompositions.

Triangulation One alternative to vertical decomposition is to perform a
triangulation, which yields a simplicial complex over $C_{free}$. Figure 6.15
shows an example. Because $C_{free}$ is an open set, there are no 0-cells. Each
2-simplex (triangle) has either three, two, or one face, depending on how much of
its boundary is shared with $C_{obs}$. A roadmap can be made by connecting the
samples for 1-cells and 2-cells as shown in Figure 6.16. Note that there are many
ways to triangulate $C_{free}$ for a given problem. The problem of finding good
triangulations, which for example means trying to avoid thin triangles, is given
considerable attention in computational geometry [83, 189, 219].




Figure 6.15: A triangulation of $C_{free}$.

Figure 6.16: A roadmap obtained from the triangulation.

How can the triangulation be computed? It might seem tempting to run the
vertical decomposition algorithm of Section 6.2.2 and split each trapezoid into
two triangles. Even though this leads to triangular cells, it does not produce a
simplicial complex (two triangles could abut the same edge of a triangle). A
naive approach is to incrementally split faces by attempting to connect two
vertices of a face by a line segment. If this segment does not intersect other
segments, then the split can be made. This process can be iteratively performed
over all vertices of faces that have more than three vertices, until a
triangulation is eventually obtained. Unfortunately, this results in an $O(n^3)$
time algorithm because $O(n^2)$ pairs must be checked in the worst case, and each
check requires $O(n)$ time to determine whether an intersection occurs with other
segments. This can be easily reduced to $O(n^2 \lg n)$ by performing radial
sweeping. Chapter 3 of [189] presents an algorithm that runs in $O(n \lg n)$ time
by first partitioning $C_{free}$ into monotone polygons, and then efficiently
triangulating each monotone polygon. If $C_{free}$ is simply connected, then
surprisingly, a triangulation can be computed in linear time [137].
Unfortunately, this algorithm is too complicated to use in practice (there are,
however, simpler algorithms whose complexity is close to $O(n)$; see the end of
Chapter 3 of [189] for a survey).
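
For a feel of what a simple triangulation routine looks like, the sketch below
uses ear clipping. This is a swapped-in textbook technique, not one of the
algorithms cited above. It assumes a simple polygon without holes, given as a
counterclockwise vertex list, so it covers only the simply connected case; the
function names are illustrative.

```python
def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def in_triangle(p, a, b, c):
    """True if p lies inside or on the boundary of the CCW triangle (a, b, c)."""
    return cross(a, b, p) >= 0 and cross(b, c, p) >= 0 and cross(c, a, p) >= 0

def ear_clip(poly):
    """Triangulate a simple polygon given as a CCW list of (x, y) vertices.

    Returns a list of index triples into poly. Holes are not supported.
    """
    idx = list(range(len(poly)))
    triangles = []
    while len(idx) > 3:
        for k in range(len(idx)):
            i, j, l = idx[k - 1], idx[k], idx[(k + 1) % len(idx)]
            a, b, c = poly[i], poly[j], poly[l]
            if cross(a, b, c) <= 0:
                continue      # reflex or degenerate corner: not an ear
            if any(in_triangle(poly[m], a, b, c)
                   for m in idx if m not in (i, j, l)):
                continue      # another vertex blocks the diagonal from a to c
            triangles.append((i, j, l))
            del idx[k]        # clip the ear and rescan
            break
        else:
            raise ValueError("no ear found; is the polygon simple and CCW?")
    triangles.append(tuple(idx))
    return triangles

# Hypothetical L-shaped region with six vertices.
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
tris = ear_clip(l_shape)
```

Each clipped ear costs a linear scan, so this version takes $O(n^3)$ time in the
worst case, in line with the naive bound quoted above. Because whole triangles
are removed along polygon diagonals, neighboring triangles share complete edges.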

Figure 6.17: The cylindrical decomposition differs from the vertical
decomposition in that the rays continue forever instead of stopping at the
nearest edge. Compare this figure to Figure 6.6.

Cylindrical decomposition The cylindrical decomposition is very similar to the
vertical decomposition, except that when any of the cases in Figure 6.2 occurs,
then a vertical line slices through all faces, all of the way from
$y = -\infty$ to $y = \infty$. The result is shown in Figure 6.17, which may be
considered as a singular complex. This may appear very inefficient in comparison
to the vertical decomposition; however, it is presented here because it
generalizes nicely to any dimension, configuration space topology, and
semi-algebraic model, which eases the transition to the general decompositions.
The most important property of the cylindrical decomposition is shown in Figure
6.18. Consider each vertical strip between two events. When traversing a strip
from $y = -\infty$ to $\infty$, the points alternate between being in $C_{obs}$
and $C_{free}$. For example, between events 4 and 5, the points below edge $f$
are in $C_{free}$. Points between $f$ and $g$ lie in $C_{obs}$. Points between $g$
and $h$ lie in $C_{free}$, and so forth. The cell decomposition can be defined so
that 2D cells are also created in $C_{obs}$. Let $S(x, y)$ denote the logical
predicate (3.5) from Section 3.1.1. When traversing a strip, the value of
$S(x, y)$ also alternates. This behavior is the main reason to construct a
cylindrical decomposition, which will become very valuable in Section 6.4.2. Each
vertical strip is actually considered to be a cylinder; hence the name
cylindrical decomposition (i.e., there are not necessarily any cylinders in the
3D geometric sense).
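
The alternation property makes labeling the cells of a strip almost trivial once
the crossing edges are sorted. The sketch below assumes general position (no
coincident edges) and that the caller knows whether the bottom cell is free; the
function name and the three-edge input are illustrative, with the edge list cut
short compared to the example in the text.

```python
def label_strip(sorted_edges, bottom_is_free=True):
    """Label the cells of one vertical strip of a cylindrical decomposition.

    sorted_edges: names of the edges crossing the strip, ordered by increasing y.
    Returns (lower_edge, upper_edge, label) triples; None stands for the
    unbounded ends at y = -infinity and y = +infinity.
    """
    bounds = [None] + list(sorted_edges) + [None]
    cells = []
    free = bottom_is_free
    for lower, upper in zip(bounds, bounds[1:]):
        cells.append((lower, upper, "free" if free else "obs"))
        free = not free   # crossing an edge flips between C_free and C_obs
    return cells

cells = label_strip(["f", "g", "h"])
```

For this input the cell below `f` and the cell between `g` and `h` are labeled
free, and the cell between `f` and `g` is labeled as an obstacle cell, matching
the alternation described above.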

6.3.3 3D Vertical Decomposition 

It turns out that the vertical decomposition method of Section 6.2.2 can be
extended to any dimension $n$ by recursively applying the sweeping idea. The
method requires, however, that $C_{obs}$ be piecewise linear. In other words,
$C_{obs}$ is represented as a semi-algebraic model for which all primitives are
linear. Unfortunately, most of the general motion planning problems involve
nonlinear algebraic primitives because of the nonlinear transformations that
arise from rotations. Recall the complicated algebraic $C_{obs}$ model
constructed in Section 4.3.3. To handle generic algebraic models, powerful
techniques from computational algebraic geometry are needed. This will be covered
in Section 6.4.

Figure 6.18: The cylindrical decomposition produces vertical strips. Inside of a
strip, there is a stack of collision-free cells, separated by $C_{obs}$.

One interesting planning problem in which $C_{obs}$ is piecewise linear is for a
polyhedral robot that can translate in $\mathbb{R}^3$, and the obstacles in $W$
are polyhedra. Because the transformation equations are linear in this case,
$C_{obs} \subset \mathbb{R}^3$ is polyhedral. The polygonal faces of $C_{obs}$
are obtained by forming geometric primitives for each of the Type FV, Type VF,
and Type EE cases of contact between $A$ and $O$, as mentioned in Section 4.3.2.

Figure 6.19 illustrates the algorithm that constructs the 3D vertical
decomposition. Compare this with the algorithm in Section 6.2.2. Let $(x, y, z)$
denote points in $C = \mathbb{R}^3$. The vertical decomposition yields convex
3-cells, 2-cells, and 1-cells. Neglecting degeneracies, a generic 3-cell is
bounded by 6 planes. The cross section of a 3-cell for some fixed $x$ value
yields a trapezoid or triangle, exactly as in the 2D case, but in a plane
parallel to the YZ plane. Two sides of a generic 3-cell are parallel to the YZ
plane, and two other sides are parallel to the XZ plane. It is bounded above and
below by two polygonal faces of $C_{obs}$.

Initially, sort the $C_{obs}$ vertices by their X coordinate to obtain the
events. Now consider sweeping a plane perpendicular to the X axis. The plane for
a fixed value of $x$ produces a two-dimensional, polygonal slice of $C_{obs}$.
Three such slices are shown at the bottom of Figure 6.19. Each slice is parallel
to the YZ plane, and appears to look exactly like a problem that can be solved by
the 2D vertical decomposition method. The 2-cells in a slice are actually slices
of 3-cells in the 3D decomposition. The only places in which these 3-cells can
change in an important way are where the sweeping plane stops at some $x$ value.
The center slice in Figure 6.19 corresponds to the case in which a vertex of a
convex polyhedron is encountered, and all of the polyhedron lies to the right of
the sweep plane (i.e., it has not been encountered yet). This corresponds to a
place where a critical change must occur in the slices. These are 3D versions of
the cases in Figure 6.2, which indicate how the vertical decomposition needs to
be updated. The algorithm proceeds by first building the 2D vertical
decomposition at the first $x$ event. At each event, the 2D vertical
decomposition must be updated to take into account the critical changes. During
this process, the three-dimensional cell decomposition and roadmap can be
incrementally constructed, just as in the 2D case.

The roadmap is constructed by placing a sample point in the center of each
3-cell and each 2-cell. The vertices are the sample points, and edges are added
to the roadmap by connecting the sample points of adjacent pairs of 3-cells and
2-cells.

This same principle can be extended to any dimension, but the applications to
motion planning are limited because the method requires linear models (or at
least it is very challenging to adapt to nonlinear models; in some special
cases, this can be done).

Figure 6.20: Motion planning for a line segment that can translate and rotate in
a 2D world.

6.3.4 A Decomposition for a Line-Segment Robot 

This section presents one of the simplest cell decompositions that involves
nonlinear models, yet it is already fairly complicated. This will help to give an
appreciation of the difficulty of combinatorial planning in general. Suppose the
planning problem is as shown in Figure 6.20. The robot, $A$, is a single line
segment that can translate or rotate in $W = \mathbb{R}^2$. The dot on one end of
$A$ is used to illustrate its origin, and is not part of the model. The
configuration space, $C$, is homeomorphic to $\mathbb{R}^2 \times S^1$. Assume
that the parameterization $\mathbb{R}^2 \times [0, 2\pi]/\sim$ is used, in which
the identification equates $\theta = 0$ and $\theta = 2\pi$. A point in $C$ is
represented as $(x, y, \theta)$.

Figure 6.21: Fix $(x, y)$, and swing the segment around for all values of
$\theta \in [0, 2\pi]/\sim$. (a) Note the vertex and edge features that are hit
by the segment. (b) Record orientation intervals over which the robot is not in
collision.

First consider making a cell decomposition for the case in which the segment
can only translate. The method from Section 4.3.2 can be used to compute
$C_{obs}$ by treating the robot-obstacle interaction with Type EV and Type VE
contacts. When the interior of $A$ touches an obstacle vertex, then Type EV is
obtained. An endpoint of $A$ touching the interior of an obstacle edge yields
Type VE. Each case produces an edge of $C_{obs}$, which is polygonal. Once this is
represented, the vertical decomposition can be used to solve the problem. This
may inspire a reasonable numerical approach to the rotational case, which is to
discretize $\theta$ into $K$ values, $i \Delta\theta$, for $0 \le i < K$, and
$\Delta\theta = 2\pi/K$ [11]. The obstacle region, $C_{obs}$, will be polygonal
for each case, and we can imagine having a stack of $K$ polygonal regions. A
roadmap can be formed by connecting sampling points inside of a slice in the
usual way, and also connecting samples between corresponding cells in neighboring
slices. If $K$ is large enough, this strategy could work quite well, but the
method is not complete because a sufficient value for $K$ cannot be determined in
advance. The method is actually an interesting hybrid between combinatorial and
sampling-based motion planning. A resolution complete version can be imagined.
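
The bookkeeping for the discretized approach is small. The sketch below only
generates the $K$ sampled orientations and the indices of neighboring slices,
with wraparound because $\theta = 0$ and $\theta = 2\pi$ are identified. Building
the polygonal obstacle region in each slice is not shown, and the function names
are illustrative.

```python
import math

def theta_slices(K):
    """Return the K sampled orientations i * dtheta with dtheta = 2*pi / K."""
    if K < 1:
        raise ValueError("K must be a positive integer")
    dtheta = 2.0 * math.pi / K
    return [i * dtheta for i in range(K)]

def neighboring_slices(i, K):
    """Indices of the slices adjacent to slice i, wrapping since 0 ~ 2*pi."""
    return ((i - 1) % K, (i + 1) % K)

thetas = theta_slices(4)
first_neighbors = neighboring_slices(0, 4)   # slice 0 touches slices 3 and 1
```

Roadmap edges between slices would connect samples of corresponding cells in
slices `i` and `(i + 1) % K`, so the last slice links back to the first.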

In the limiting case in which $K$ tends to infinity, the surfaces of $C_{obs}$
will be curved along the $\theta$ direction. The conditions in Section 4.3.3 must
be applied to generate the actual obstacle regions. This is possible, but it
yields a semi-algebraic representation of $C_{obs}$ in terms of implicit
polynomial primitives. It is no easy task to determine an explicit representation
in terms of simple cells that can be used for motion planning. The method of
Section 6.3.3 cannot be used because $C_{obs}$ is not polyhedral. Therefore,
special analysis is warranted to produce a cell decomposition.

The general idea is to construct a cell decomposition in $\mathbb{R}^2$ by
considering only the translation part, $(x, y)$. Each cell in $\mathbb{R}^2$ will
then be lifted into $C$ by considering $\theta$ as a third axis that is "above"
the XY plane. The result will be a cylindrical decomposition in which each cell
in the XY plane produces a cylindrical stack of cells for different $\theta$
values. Recall the cylinders in Figures 6.17 and 6.18. The vertical axis
corresponds to $\theta$ in the current setting, and the horizontal axis is
replaced by two axes, X and Y.

To construct the decomposition in $\mathbb{R}^2$, consider the various
robot-obstacle contacts shown in Figure 6.21. In Figure 6.21.a, the segment
swings around from a fixed $(x, y)$. Two different kinds of contacts arise. For
some orientation (value of $\theta$), the segment contacts $v_1$, forming a Type
EV contact. For three other orientations, the segment contacts an edge, forming
Type VE contacts. Once again using the feature concept, there are four
orientations at which the segment contacts a feature. Each feature may be either
a vertex or an edge. Between the two contacts with $e_2$ and $e_3$, the robot is
not in collision. These configurations lie in $C_{free}$. Also, configurations
for which the robot is between contacts $e_3$ (the rightmost contact) and $v_1$
are in $C_{free}$. All other orientations produce configurations in $C_{obs}$.
Note that the line segment cannot get from being between $e_2$ and $e_3$ to being
between $e_3$ and $v_1$, unless the $(x, y)$ position is changed. It therefore
seems sensible that these must correspond to different cells in whatever
decomposition is made.

Figure 6.22: If $x$ is increased enough, a critical change occurs in the radar
map because $v_1$ can no longer be reached by the robot.

Figure 6.21.b illustrates which values of $\theta$ produce collision. We will
refer to this representation as a radar map. The four contact orientations are
indicated by the contact feature. The notation $[e_3, v_1]$ and $[e_2, e_3]$ can
be used to identify the two intervals for which $(x, y, \theta) \in C_{free}$.
Now imagine changing $(x, y)$ by a small amount, to obtain $(x', y')$. How would
the radar map change? The precise angles at which the contacts occur would
change, but the notation $[e_3, v_1]$ and $[e_2, e_3]$, for configurations that
lie in $C_{free}$, remains unchanged. Even though the angles change, there is no
interesting change in terms of the contacts; therefore, it makes sense to declare
$(x, y, \theta)$ and $(x', y', \theta')$ to lie in the same cell in $C_{free}$,
because $\theta$ and $\theta'$ both place the segment between the same contacts.
Imagine a column of two 3-cells above a small area around $(x, y)$. One 3-cell is
for orientations in $[e_3, v_1]$, and the other is for orientations in
$[e_2, e_3]$. These appear to be 3D regions in $C_{free}$ because each of $x$,
$y$, and $\theta$ can be perturbed a small amount without changing the cell.

Of course, if $(x, y)$ is changed enough, then at some point we expect a dramatic
change to occur in the radar map. For example, imagine $e_3$ is infinitely long, and
the $x$ value is gradually increased in Figure 6.21.a. The black band between $v_1$
and $e_2$ in Figure 6.21.b will shrink in length. Eventually, when the distance from
$(x', y')$ to $v_1$ is greater than the length of $\mathcal{A}$, the black band will disappear. This
situation is shown in Figure 6.22. The change is very important to notice because
after that region vanishes, any orientation $\theta'$ between $e_3$ and $e_3$, traveling the
long way around the circle, will produce a configuration $(x',y',\theta') \in \mathcal{C}_{free}$. This
seems very important because it tells us that we can travel between the original
two cells by moving the robot farther away from $v_1$, rotating the robot, and then
moving back. Now move from the position shown in Figure 6.22 in the positive $Y$
direction. The remaining black band will begin to shrink, and will finally disappear
when the distance to $e_3$ is greater than the robot length. This represents another
critical change.

The radar map can be characterized by specifying a circular ordering

$$([f_1, f_2], [f_3, f_4], [f_5, f_6], \ldots, [f_{2k-1}, f_{2k}]), \tag{6.6}$$

when there are $k$ orientation intervals over which the configurations lie in $\mathcal{C}_{free}$.
For the radar map in Figure 6.21.b, this representation yields $([e_3, v_1], [e_2, e_3])$.
Each $f_i$ is a feature, which may be an edge or a vertex. Some of the $f_i$ may
be identical; the representation for Figure 6.22.b is $([e_3, e_3])$. The intervals are
specified in counterclockwise order around the radar map. Because the ordering
is circular, it does not matter which interval is specified first. There are two
degenerate cases. If $(x,y,\theta) \in \mathcal{C}_{free}$ for all $\theta \in [0, 2\pi)$, then we can write $()$ for
the ordering. On the other hand, if $(x, y, \theta) \in \mathcal{C}_{obs}$ for all $\theta \in [0, 2\pi)$, then we just
write $\emptyset$.
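
Because the ordering is circular, two radar maps should be compared only up to rotation of the interval list. The following is a minimal sketch of such a comparison, not part of the original method's implementation. Feature names are plain strings, `None` stands for the all-collision case, and the function names `canonical` and `critical_change` are our own hypothetical choices.

```python
def canonical(ordering):
    """Return a rotation-independent form of a circular ordering.

    ordering is a sequence of (f_start, f_end) feature pairs listed
    counterclockwise, the empty tuple () if every orientation is free,
    or None if every orientation is in collision.
    """
    if ordering is None:
        return None
    intervals = tuple(ordering)
    if not intervals:
        return ()
    # Try every starting interval and keep the lexicographically smallest.
    rotations = [intervals[i:] + intervals[:i] for i in range(len(intervals))]
    return min(rotations)


def critical_change(before, after):
    """True if the circular ordering differs between two positions."""
    return canonical(before) != canonical(after)


# Orderings quoted in the text for Figures 6.21.b and 6.22.b.
fig_6_21 = (("e3", "v1"), ("e2", "e3"))
fig_6_21_rotated = (("e2", "e3"), ("e3", "v1"))
fig_6_22 = (("e3", "e3"),)
```

Listing the same intervals from a different starting point is not reported as a change, whereas the disappearance of the $v_1$ interval is.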

Now we are prepared to explain the cell decomposition in more detail. Imagine
traveling along a path in $\mathbb{R}^2$, and producing an animated version of the radar
map in Figure 6.21.b. We say that a critical change occurs each time the circular
ordering representation of (6.6) is forced to change. Changes occur when intervals:
1) appear, 2) disappear, 3) split apart, 4) merge into one, or 5) when the feature
of an interval changes. The first task is to partition $\mathbb{R}^2$ into maximal 2-cells over
which no critical changes occur. Each one of these 2-cells, $R$, will represent the
projection of a strip of 3-cells in $\mathcal{C}_{free}$. Each 3-cell is defined as follows. Let
$\{R, [f_i, f_{i+1}]\}$ denote the three-dimensional region in $\mathcal{C}$ for which $(x,y) \in R$ and
$\theta$ places the segment between contacts $f_i$ and $f_{i+1}$. The cylinder of cells above $R$
is given by $\{R, [f_i, f_{i+1}]\}$ for each interval in the circular ordering representation,
(6.6). If any orientation is possible because $\mathcal{A}$ never contacts an obstacle while in
$R$, then we write $\{R\}$.

What are the positions in $\mathbb{R}^2$ that cause critical changes to occur? It turns
out that there are five different cases to consider, each of which produces a set of
critical curves in $\mathbb{R}^2$. When one of these curves is crossed, a critical change occurs.
If none of these curves is crossed, then no critical change can occur. Therefore,
these curves will precisely define the boundaries of our desired 2-cells in $\mathbb{R}^2$. Let
$L$ denote the length of the line segment, $\mathcal{A}$.

Figure 6.23: Four of the five cases that produce critical curves in $\mathbb{R}^2$.

Two of the five cases have already been observed in Figures 6.21 and 6.22.
These appear in Figures 6.23.a and 6.23.b, and occur if $(x,y)$ is within $L$
of an edge or a vertex. The third and fourth cases are shown in Figures 6.23.c
and 6.23.d, respectively. The third case occurs because crossing the curve causes
$\mathcal{A}$ to change between being able to touch $e$ and being able to touch $v$. This must
be extended from any edge at an endpoint that is a reflex vertex (interior angle
is greater than $\pi$). The fourth case is actually a resurfacing of the bitangent case
from Figure 6.9, which arose for the shortest path graph. If the vertices are within
$L$ of each other, then a linear critical curve is generated because $\mathcal{A}$ is no longer able
to touch $v_2$ when crossing it from right to left. Bitangents always produce curves
in pairs; the curve above $v_2$ is not shown. The final case, shown in Figure 6.24, is
the most complicated. It is a fourth-degree algebraic curve called the Conchoid
of Nicomedes, which arises from $\mathcal{A}$ being in simultaneous contact with $v$ and
$e$. Inside of the teardrop-shaped curve, $\mathcal{A}$ can contact $e$ but not $v$. Just outside
of the curve, it can touch $v$. If the $XY$ coordinate frame is placed so that $v$ is the
$(0, 0)$ origin, then the equation of the curve is

$$(x^2 + y^2)(y + d)^2 - y^2 L^2 = 0, \tag{6.7}$$

in which $d$ is the distance from $v$ to $e$.
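
As a numerical sanity check on the conchoid equation, the sketch below generates positions geometrically and evaluates the implicit polynomial at them. It rests on assumptions that are ours rather than the text's: $v$ is the origin, $e$ lies on the line $y = -d$, and the polynomial is taken in the standard conchoid form $(x^2 + y^2)(y + d)^2 - y^2 L^2$. The function names are hypothetical.

```python
import math


def conchoid_residual(x, y, d, L):
    """Value of (x^2 + y^2)(y + d)^2 - y^2 L^2 at the point (x, y)."""
    return (x * x + y * y) * (y + d) ** 2 - (y * y) * L * L


def conchoid_point(t, d, L, branch=1):
    """A point on the line through v and q = (t, -d), at distance L from q.

    q is a point on the line containing e. branch = +1 or -1 selects on
    which side of q the point lies; both sides satisfy the same polynomial.
    """
    norm = math.hypot(t, d)
    scale = 1.0 + branch * L / norm
    return scale * t, scale * (-d)


# Largest residual over a range of contact points, for d = 1 and L = 2.
worst = max(
    abs(conchoid_residual(*conchoid_point(t / 10.0, 1.0, 2.0, branch), 1.0, 2.0))
    for t in range(-30, 31)
    for branch in (1, -1)
)
```

The residual stays at floating-point noise level for every generated point, which is what one expects if the geometric construction and the implicit equation describe the same curve.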

Putting all of the curves together generates a cell decomposition of $\mathbb{R}^2$. There
are noncritical regions, over which there is no change in (6.6), which form the
2-cells. The boundaries between adjacent 2-cells are sections of the critical curves,
and form 1-cells. There are also 0-cells at places where critical curves intersect.
Figure 6.25 shows an example adapted from [437]. Note that critical curves are not
drawn if their corresponding configurations are all in $\mathcal{C}_{obs}$. The method still works
correctly if they are included, but unnecessary cell boundaries will be made. Just
for fun, they could be used to form a nice cell decomposition of $\mathcal{C}_{obs}$, in addition to
$\mathcal{C}_{free}$. Since $\mathcal{C}_{obs}$ is avoided, it seems best to avoid wasting time on decomposing
it. These unnecessary cases can be detected by imagining that $\mathcal{A}$ is a laser with
range $L$. As the laser sweeps around, only features that are contacted by the laser
are relevant. Any features that are hidden from view of the laser correspond to
unnecessary boundaries.

Figure 6.24: The fifth case is the most complicated. It results in a fourth-degree
algebraic curve called the Conchoid of Nicomedes.

After the cell decomposition has been constructed in $\mathbb{R}^2$, it needs to be lifted
into $\mathbb{R}^2 \times [0, 2\pi]/\sim$. This generates a cylinder of 3-cells above each 2D noncritical
region, $R$. The roadmap could easily be defined to have a vertex for every 3-cell
and 2-cell, which would be consistent with previous cell decompositions; however,
vertices at 2-cells will not be generated here to make the coming example easier to
understand. Each 3-cell, $\{R, [f_i, f_{i+1}]\}$, will correspond to a vertex in the roadmap.
The roadmap edges will connect neighboring 3-cells that have a 2-cell as part of
their common boundary. This means that in $\mathbb{R}^2$ they share a 1D portion of a
critical curve.

The problem is to determine which 3-cells are actually adjacent. Figure 6.26
depicts the cases in which connections need to be made. The $XY$ plane is
represented as one axis (imagine looking in a direction parallel to it). Consider two
neighboring 2-cells (noncritical regions), $R$ and $R'$, in the plane. It is assumed
that a 1-cell (critical curve) in $\mathbb{R}^2$ separates them. The task is to connect together
3-cells in the cylinders above $R$ and $R'$. If neighboring cells share the same feature
pair, then they are connected. This means that $\{R, [f_i, f_{i+1}]\}$ and $\{R', [f_i, f_{i+1}]\}$
must be connected. In some cases, one feature may change, while the interval of
orientations remains unchanged. This may happen, for example, when the robot
changes from contacting an edge to contacting a vertex of the edge. In these cases,
a connection must also be made. One case illustrated in Figure 6.26 is when a
splitting or merging of orientation intervals occurs. Traveling from $R$ to $R'$, the
figure shows two regions merging into one. In this case, connections must be made
from each of the original two 3-cells to the merged 3-cell. When constructing the
roadmap edges, sample points of both the 3-cells and 2-cells should be used to ensure
collision-free paths are obtained, as in the case of the vertical decomposition in
Section 6.2.2. Figure 6.27 depicts the cells for the example in Figure 6.25. Each
noncritical region has between one and three cells above it. Each of the various
cells is indicated by a shortened robot that points in the general direction of the
cell. The connections between the cells are also shown. Using the noncritical
region and feature names from Figure 6.25, the resulting roadmap is depicted
abstractly in Figure 6.28. Each vertex represents a 3-cell in $\mathcal{C}_{free}$, and each edge
represents the crossing of a 2-cell between adjacent 3-cells. To make the roadmap
consistent with previous roadmaps, we could insert a vertex into every edge, and
force the path to travel through the sample point of the corresponding 2-cell.

Figure 6.26: Connections are made between neighboring 3-cells that lie above
neighboring noncritical regions.

Figure 6.27: A depiction of the 3-cells above the noncritical regions.

Once the roadmap has been constructed, it can be used in the same way as
other roadmaps in this chapter to solve a query. Many implementation details have
been neglected here. Because of the fifth case, some of the region boundaries in $\mathbb{R}^2$
are fourth-degree algebraic curves. Ways to prevent the explicit characterization of
every noncritical region boundary, and other implementation details, are covered
in [34]. Some of these details are also summarized in [437].

How many cells can there possibly be in the worst case? First count the
number of noncritical regions in $\mathbb{R}^2$. There are $O(n)$ different ways to generate
critical curves of the first three types because each corresponds to a single feature.
Unfortunately, there are $O(n^2)$ different ways to generate bitangents and the
Conchoid of Nicomedes because these are based on pairs of features. Assuming no
self-intersections, a collection of $O(n^2)$ curves in $\mathbb{R}^2$ may intersect to generate at
most $O(n^4)$ regions. Above each noncritical region in $\mathbb{R}^2$, there could be a cylinder
of $O(n)$ 3-cells. Therefore, the size of the cell decomposition is $O(n^5)$ in the worst
case. In practice, however, it is highly unlikely that all of these intersections will
occur, and the number of cells is expected to be reasonable. In [675], an
$O(n^5)$-time algorithm is given to construct the cell decomposition. Other algorithms,
which have much better running times, are mentioned in Section 6.5.3, but they
are much more complicated to understand and implement.

Figure 6.28: The roadmap corresponding to the example in Figure 6.25.



6.4 Computational Algebraic Geometry 

This section presents algorithms that are so general that they solve any problem 
of Formulation 4.3.1 and even the kinematic closure problems of Section 4.4. It 
is amazing that such algorithms exist; however, it is also unfortunate that they 
are both extremely challenging to implement and not efficient enough for most 
applications. The concepts and tools here were mostly developed in the context 
of computational real algebraic geometry [58, 178]. They are powerful enough 
to conquer numerous problems in robotics, computer vision, geometric modeling, 
computer-aided design, and geometric theorem proving. One of these problems 
happens to be motion planning, for which the connection to computational alge- 
braic geometry was first recognized in [676]. 

6.4.1 Basic Definitions and Concepts 

This section builds on the semi-algebraic definitions from Section 3.1 and the
polynomial definitions from Section 4.4.1. It will be assumed that $\mathcal{C} \subseteq \mathbb{R}^n$, which
could for example arise by representing each copy of $SO(2)$ or $SO(3)$ in its $2 \times 2$
or $3 \times 3$ matrix form. For example, in the case of a 3D rigid body, we know
that $\mathcal{C} = \mathbb{R}^3 \times \mathbb{RP}^3$, which is a six-dimensional manifold, but it can be embedded
in $\mathbb{R}^{12}$, which is obtained from the Cartesian product of all $3 \times 3$ matrices and
$\mathbb{R}^3$. The constraints required for rotation matrices to lie in $SO(2)$ or $SO(3)$ are
polynomials, and can therefore be added to the semi-algebraic models of $\mathcal{C}_{obs}$ and
$\mathcal{C}_{free}$. If the dimension of $\mathcal{C}$ is less than $n$, then the algorithm presented below is
sufficient, but there are some representation and complexity issues that motivate
using a special parameterization of $\mathcal{C}$ to make both dimensions the same while
altering the topology of $\mathcal{C}$ to become homeomorphic to $\mathbb{R}^n$. This will be discussed
briefly in Section 6.4.2.

Suppose that the models in $\mathbb{R}^n$ are all expressed using polynomials from
$\mathbb{Q}[x_1, \ldots, x_n]$, the set of polynomials$^6$ over the field of rational numbers $\mathbb{Q}$. Let
$f \in \mathbb{Q}[x_1, \ldots, x_n]$ denote a polynomial.

Tarski sentences. Recall the logical predicates that were formed in Section 3.1.
They will be used again here, but here they are defined with a little more flexibility.
For any $f \in \mathbb{Q}[x_1, \ldots, x_n]$, an atom is an expression of the form $f \bowtie 0$, in which $\bowtie$
may be any relation in the set $\{=, \neq, <, >, \leq, \geq\}$. In Section 3.1, such expressions
were used to define logical predicates. Here we assume that relations other than
$\leq$ can be used, and that the vector of polynomial variables lies in $\mathbb{R}^n$.

A quantifier-free formula, $\phi(x_1, \ldots, x_n)$, is a logical predicate composed of atoms
and logical connectives, "and", "or", and "not", which are denoted by $\wedge$, $\vee$, and
$\neg$, respectively. Each atom itself is considered as a logical predicate that yields
true if and only if the relation is satisfied when the polynomial is evaluated at
the point $(x_1, \ldots, x_n) \in \mathbb{R}^n$.

Example 6.4.1 An example of a predicate $\phi$ over $\mathbb{R}^3$ is

$$\phi(x_1, x_2, x_3) = (x_1^2 x_3 - x_2^4 < 0) \vee \left[ \neg(3 x_2 x_3 \neq 0) \wedge (2 x_3^2 - x_1 x_2 x_3 + 2 \geq 0) \right]. \tag{6.8}$$

The precedence order of the connectives follows the laws of Boolean algebra. ■

Let a quantifier, $\mathcal{Q}$, be either of the symbols $\forall$, which means "for all", or $\exists$,
which means "there exists". A Tarski sentence, $\Phi$, is a logical predicate that may
additionally involve quantifiers on some or all of the variables. In general, a Tarski
sentence takes the form

$$\Phi(x_1, \ldots, x_{n-k}) = (\mathcal{Q} z_1)(\mathcal{Q} z_2) \cdots (\mathcal{Q} z_k) \left[ \phi(z_1, \ldots, z_k, x_1, \ldots, x_{n-k}) \right], \tag{6.9}$$

in which the $z_i$ are the quantified variables, the $x_i$ are the free variables, and
$\phi$ is a quantifier-free formula. The quantifiers do not necessarily have to appear
at the left to be a valid Tarski sentence; however, such expressions can always
be manipulated into an equivalent expression that has all quantifiers in front, as
shown in (6.9). The procedure for moving quantifiers to the front is [559]: 1)
eliminate any redundant quantifiers; 2) rename some of the variables to ensure
that the same variable does not end up appearing both free and bound; 3) move
negation symbols as far inward as possible; 4) push the quantifiers to the left.

$^6$ It will be explained shortly why $\mathbb{Q}[x_1, \ldots, x_n]$ is preferred over $\mathbb{R}[x_1, \ldots, x_n]$.

Example 6.4.2 (Tarski sentences) Several examples are given. Tarski sentences
that have no free variables are either true or false in general because
there are no arguments on which the results depend. Here is an example,

$$\Phi = \forall x\, \exists y\, (x^2 - y < 0), \tag{6.10}$$

which is true because for any $x \in \mathbb{R}$, some $y \in \mathbb{R}$ can always be chosen so
that $y > x^2$. In the general notation of (6.9), this example becomes $\mathcal{Q} z_1 = \forall x$,
$\mathcal{Q} z_2 = \exists y$, and $\phi(z_1, z_2) = (x^2 - y < 0)$.

Swapping the order of the quantifiers yields another Tarski sentence,

$$\Phi = \exists y\, \forall x\, (x^2 - y < 0), \tag{6.11}$$

which is false because for any $y$, there is always an $x$ such that $x^2 > y$.
Now consider a Tarski sentence that has a free variable:

$$\Phi(z) = \exists y\, \forall x\, (x^2 - z x^2 - y < 0). \tag{6.12}$$

This yields a function $\Phi : \mathbb{R} \to \{\text{true}, \text{false}\}$, in which

$$\Phi(z) = \begin{cases} \text{true} & \text{if } z \geq 1 \\ \text{false} & \text{if } z < 1. \end{cases} \tag{6.13}$$

An equivalent quantifier-free formula can be defined as $\phi(z) = (z \geq 1)$, which
takes on the same truth values as the Tarski sentence in (6.12). This might make
you wonder whether it is possible to make a simplification that eliminates the
quantifiers. This is called the quantifier elimination problem, which will be
explained shortly. ■
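
The threshold at $z = 1$ can be checked by hand. The following short derivation is only a worked instance for the sentence $\exists y\, \forall x\, (x^2 - z x^2 - y < 0)$ and is not the general elimination procedure; it rewrites the inequality and splits on the sign of $1 - z$.

```latex
\begin{align*}
\forall x \; (x^2 - z x^2 - y < 0)
  &\iff \forall x \; \bigl( (1 - z)\, x^2 < y \bigr). \\[4pt]
\text{If } z < 1:\quad
  & (1 - z)\, x^2 \to \infty \text{ as } |x| \to \infty,
    \text{ so no } y \text{ works, and the sentence is false.} \\[4pt]
\text{If } z \geq 1:\quad
  & (1 - z)\, x^2 \leq 0 \text{ for all } x
    \text{ (with equality at } x = 0\text{), so any } y > 0 \text{ works,} \\
  & \text{for example } y = 1, \text{ and the sentence is true.}
\end{align*}
```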



The decision problem. The examples in (6.10) and (6.11) lead to an interesting
problem. Consider the set of all Tarski sentences that have no free variables. The
subset of these that are true comprises the first-order theory of the reals. Can
an algorithm be developed to determine whether such a sentence is true? This
is called the decision problem for the first-order theory of the reals. At first
it may appear hopeless because $\mathbb{R}^n$ is uncountably infinite, and an algorithm
must work with a finite set. This is the familiar issue faced throughout motion
planning. Sampling-based approaches in Chapter 5 provided one kind of solution.
This idea could be applied to the decision problem, but the resulting lack of
completeness would be similar. It is not possible to check all possible points in
$\mathbb{R}^n$ by sampling. Instead, the decision problem can be solved by constructing
a combinatorial representation that exactly represents the decision problem by
partitioning $\mathbb{R}^n$ into a finite collection of regions. Inside of each region, only one
point needs to be checked. This should already seem related to cell decompositions
in motion planning; it turns out that methods developed to solve the decision
problem can also conquer motion planning.

The quantifier elimination problem. Another important problem was exemplified
in (6.12). Consider the set of all Tarski sentences of the form (6.9), which
may or may not have free variables. Can an algorithm be developed that takes
a Tarski sentence, $\Phi$, and produces an equivalent quantifier-free formula, $\phi$? Let
$x_1, \ldots, x_n$ denote the free variables. To be equivalent, both must take on the same
truth values over $\mathbb{R}^n$, which is the set of all assignments, $(x_1, \ldots, x_n)$, for the free
variables.

Given a Tarski sentence, (6.9), the quantifier elimination problem is to find a
quantifier-free formula, $\phi$, such that

$$\Phi(x_1, \ldots, x_n) = \phi(x_1, \ldots, x_n) \tag{6.14}$$

for all $(x_1, \ldots, x_n) \in \mathbb{R}^n$. This is equivalent to constructing a semi-algebraic model
because $\phi$ can always be expressed in the form

$$\phi(x_1, \ldots, x_n) = \bigvee_{i=1}^{k} \bigwedge_{j=1}^{m_i} \left( f_{i,j}(x_1, \ldots, x_n) \bowtie 0 \right), \tag{6.15}$$

in which $\bowtie$ may be either $<$, $=$, or $>$. This appears the same as (3.5), except that
(6.15) uses the relations $<$, $=$, and $>$ to allow open and closed semi-algebraic sets,
whereas (3.5) only used $\leq$ to construct closed semi-algebraic sets for $\mathcal{O}$ and $\mathcal{A}$.
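
A formula in the disjunctive form of (6.15) is straightforward to evaluate at a given point. The sketch below is one possible encoding, with names of our own choosing (`RELATIONS`, `eval_dnf`): each conjunction is a list of (polynomial, relation) atoms, and polynomials are ordinary Python callables. The illustrative closed interval is built from two quadratics chosen by us, and it shows how a closed set is obtained from $<$ and $=$ alone by enumerating the sign cases.

```python
import operator
from fractions import Fraction

# The three relations permitted in the disjunctive form.
RELATIONS = {"<": operator.lt, "=": operator.eq, ">": operator.gt}


def eval_dnf(clauses, point):
    """Evaluate an 'or' of 'and's of atoms (f, relation) at a point.

    clauses: list of conjunctions; each conjunction is a list of
             (polynomial, relation) pairs, meaning polynomial(point) rel 0.
    point:   tuple of coordinates passed to each polynomial.
    """
    return any(
        all(RELATIONS[rel](f(*point), 0) for f, rel in conjunction)
        for conjunction in clauses
    )


f1 = lambda x: x * x - 2 * x        # zero at x = 0 and x = 2
f2 = lambda x: x * x - 4 * x + 3    # zero at x = 1 and x = 3

# (f1 <= 0) and (f2 <= 0), expanded so that only "<" and "=" appear.
closed_interval = [
    [(f1, "<"), (f2, "<")],
    [(f1, "<"), (f2, "=")],
    [(f1, "="), (f2, "<")],
    [(f1, "="), (f2, "=")],
]

inside = eval_dnf(closed_interval, (Fraction(3, 2),))
outside = eval_dnf(closed_interval, (Fraction(5, 2),))
```

Exact rational inputs are used so that the equality atoms are meaningful; with floating-point coordinates, an `=` atom would rarely be satisfied.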

Once again, the problem is defined on $\mathbb{R}^n$, which is uncountably infinite, but
an algorithm must work with a finite representation. This will be achieved by the
cell decomposition technique presented in Section 6.4.2.

Semi-algebraic decomposition. As stated in Section 6.3.1, motion planning
inside of each cell in a complex should be trivial. To solve the decision and
quantifier elimination problems, a cell decomposition was developed for which
these problems become trivial in each cell. The decomposition is designed so that
only a single point in each cell needs to be checked to solve the decision problem.
The semi-algebraic set, $Y \subseteq \mathbb{R}^n$, that is expressed with (6.15) is

$$Y = \bigcup_{i=1}^{k} \bigcap_{j=1}^{m_i} \left\{ (x_1, \ldots, x_n) \in \mathbb{R}^n \;\middle|\; \operatorname{sgn}(f_{i,j}(x_1, \ldots, x_n)) = s_{i,j} \right\}, \tag{6.16}$$

in which sgn is the sign function, and each $s_{i,j} \in \{-1, 0, 1\}$, which is the range of
sgn. Once again the nice relationship between set theory and logic, which was
described in Section 3.1, appears here. We convert from a set-theoretic description
to a logical predicate by changing $\cup$ and $\cap$ to $\vee$ and $\wedge$.

Figure 6.29: A semi-algebraic decomposition of the gingerbread face yields 9
sign-invariant regions.

Let $\mathcal{F}$ denote the set of $m = \sum_{i=1}^{k} m_i$ polynomials that appear in (6.16). A sign
assignment with respect to $\mathcal{F}$ is a vector-valued function, $\operatorname{sgn}_{\mathcal{F}} : \mathbb{R}^n \to \{-1, 0, 1\}^m$.
Each $f \in \mathcal{F}$ has a corresponding position in the sign assignment vector. At this
position, the sign, $\operatorname{sgn}(f(x_1, \ldots, x_n)) \in \{-1, 0, 1\}$, appears. A semi-algebraic
decomposition is a partition of $\mathbb{R}^n$ into a finite set of connected regions that are
each sign invariant. This means that inside of each region, $\operatorname{sgn}_{\mathcal{F}}$ must remain
constant. The regions will not be referred to as cells because a semi-algebraic
decomposition is not necessarily a singular complex as defined in Section 6.3.1;
the regions here may contain holes.

Example 6.4.3 (Sign assignment) Recall Example 3.1.1 and Figure 3.4 from
Section 3.1.2. Figure 3.4.a shows a sign assignment for a case in which there is
only one polynomial, $\mathcal{F} = \{x^2 + y^2 - 4\}$. The sign assignment is defined as

$$\operatorname{sgn}_{\mathcal{F}}(x, y) = \begin{cases} -1 & \text{if } x^2 + y^2 - 4 < 0 \\ 0 & \text{if } x^2 + y^2 - 4 = 0 \\ 1 & \text{if } x^2 + y^2 - 4 > 0. \end{cases} \tag{6.17}$$

Now consider the sign assignment, $\operatorname{sgn}_{\mathcal{F}}$, shown in Figure 6.29 for the
gingerbread face of Figure 3.4.b. The polynomials of the semi-algebraic model are
$\mathcal{F} = \{f_1, f_2, f_3, f_4\}$, as defined in Example 3.1.1. In order, these are the "head",
"left eye", "right eye", and "mouth". The sign assignment produces a
four-dimensional vector of signs. Note that if $(x,y)$ lies on one of the zeros of a
polynomial in $\mathcal{F}$, a 0 appears in the sign assignment. If the curves of two or more
of the polynomials had intersected, then the sign assignment would produce more
than one 0 at the intersection points.

For the semi-algebraic decomposition for the gingerbread face in Figure 6.29,
there are nine regions. Five two-dimensional regions correspond to 1) being outside
of the face, 2) inside of the left eye, 3) inside of the right eye, 4) inside of the mouth,
and 5) inside of the face, but outside of the mouth and eyes. There are four
one-dimensional regions, each of which corresponds to points that lie on one of the
zero sets of a polynomial. The resulting decomposition is not a singular complex
because the $(-1, 1, 1, 1)$ region contains three holes. ■



A decomposition such as the one in Figure 6.29 would not be very useful for
motion planning because of the holes in the regions. Further refinement will be
needed for motion planning, which is fortunately produced by cylindrical algebraic
decomposition. On the other hand, any semi-algebraic decomposition is quite
useful for solving the decision problem. Only one point needs to be checked
inside of each region to determine whether some Tarski sentence that has no free
variables is true. Why? Observe that if the polynomial signs cannot change over
some region, then the true/false value of the corresponding logical predicate,
$\Phi$, cannot change. Therefore, it is sufficient to check only one point per
sign-invariant region.



6.4.2 Cylindrical Algebraic Decomposition 

Cylindrical algebraic decomposition is a general method that produces a
cylindrical decomposition in the same sense considered in Section 6.3.2 for polygons
in $\mathbb{R}^2$, and also the decomposition in Section 6.3.4 for the line-segment robot.
It is sometimes referred to as Collins decomposition after its original developer
[24, 168, 169]. In fact, the decomposition in Figure 6.18 can be considered as a
cylindrical algebraic decomposition for a semi-algebraic set in which every
geometric primitive is a linear polynomial. In this section, such a decomposition is
generalized to any semi-algebraic set in $\mathbb{R}^n$.

The idea is to develop a sequence of projections that drops the dimension
of the semi-algebraic set by one each time. Initially, the set is defined over $\mathbb{R}^n$,
and after one projection, a semi-algebraic set is obtained in $\mathbb{R}^{n-1}$. Eventually,
the projection reaches $\mathbb{R}$, and a univariate polynomial is obtained for which the
zeros are at the critical places where cell boundaries need to be formed. A cell
decomposition of 1-cells (intervals) and 0-cells is formed by partitioning $\mathbb{R}$. The
sequence is then reversed, and decompositions are formed from $\mathbb{R}^2$ up to $\mathbb{R}^n$. Each
iteration starts with a cell decomposition in $\mathbb{R}^i$ and lifts it to obtain a cylinder of
cells in $\mathbb{R}^{i+1}$. Figure 6.34 shows how the decomposition looks for the gingerbread
example; since $n = 2$, it only involves one projection and one lifting.

Semi-algebraic projections are semi-algebraic. The following is implied by
the Tarski-Seidenberg Theorem [58]:

A projection of a semi-algebraic set from dimension $n$ to dimension $n - 1$ is a
semi-algebraic set.

This gives a kind of closure of semi-algebraic sets under projection, which is
required to ensure that every projection of a semi-algebraic set in $\mathbb{R}^i$ leads to a
semi-algebraic set in $\mathbb{R}^{i-1}$. This property is actually not true for (real) algebraic
varieties, which were introduced in Section 4.4.1. These are defined using only
the $=$ relation, and are not closed under the projection operation. Therefore, it
is a good thing (not just a coincidence!) that we are using semi-algebraic sets.

Real algebraic numbers. As stated previously, the sequence of projections ends
with a univariate polynomial over $\mathbb{R}$. The sides of the cells will be defined based
on the precise location of the roots of this polynomial. Furthermore, representing a
sample point for a cell of dimension $k$ in a complex in $\mathbb{R}^n$ for $k < n$ will require
perfect precision. If the coordinates are slightly off, the point will lie in a different
cell. This raises the complicated issue of how these roots are represented and
manipulated in a computer.

For univariate polynomials of degree 4 or less, formulas exist to compute all
of the roots in terms of functions of square roots and higher-order roots. From
Galois theory [351, 611], it is known that such formulas and nice expressions for
roots do not exist in general for higher-degree polynomials, which can certainly
arise in the complicated semi-algebraic models formulated in motion planning. The
roots in $\mathbb{R}$ could be any real number, and many real numbers require infinite
representations.

One way of avoiding this mess is to assume that only polynomials in $\mathbb{Q}[x_1, \ldots, x_n]$
will be used, instead of the more general $\mathbb{R}[x_1, \ldots, x_n]$. The field $\mathbb{Q}$ is not
algebraically closed because zeros of a polynomial may lie outside of $\mathbb{Q}^n$. For example, if
$f(x_1) = x_1^2 - 2$, then $f = 0$ for $x_1 = \pm\sqrt{2}$, and $\sqrt{2} \notin \mathbb{Q}$. However, some elements
of $\mathbb{R}$ can never be a root of a polynomial in $\mathbb{Q}[x_1, \ldots, x_n]$.

The set, $\mathbb{A}$, of all real roots of all polynomials in $\mathbb{Q}[x]$ is called the set of
real algebraic numbers. The set $\mathbb{A} \subset \mathbb{R}$ actually represents a field (recall from
Section 4.4.1). Several nice algorithmic properties of the numbers in $\mathbb{A}$ are: 1)
they all have finite representations, 2) addition and multiplication operations on
elements of $\mathbb{A}$ can be computed in polynomial time, and 3) conversions between
different representations of real algebraic numbers can be performed in polynomial
time. This means that all operations can be done without resorting to some
kind of numerical approximation. In some applications, such approximations are
fine; however, for algebraic decompositions, they destroy critical information by
potentially confusing roots (e.g., how can we know for sure whether a polynomial
has multiple roots, or just two roots that are very close together?).
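
With rational coefficients, the question of multiple roots versus nearby roots can be settled exactly: a polynomial has a repeated root precisely when it shares a nonconstant factor with its derivative. The following is a minimal sketch of that test using Python's exact `Fraction` type. It is not one of the representations cited in this section, the helper names are our own, and a nonzero input polynomial is assumed. Coefficients are listed from the constant term upward.

```python
from fractions import Fraction


def trim(p):
    """Copy p without trailing (highest-degree) zero coefficients."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_rem(a, b):
    """Remainder of a divided by b (b nonzero), in exact arithmetic."""
    a, b = trim(a), trim(b)
    while len(a) >= len(b):
        q = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= q * c
        a = trim(a)
    return a


def derivative(p):
    """Coefficients of the derivative of p."""
    return [i * c for i, c in enumerate(p)][1:]


def poly_gcd(a, b):
    """Monic greatest common divisor by the Euclidean algorithm."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_rem(a, b)
    return [c / a[-1] for c in a]


def has_repeated_root(coeffs):
    """True if the (nonzero) polynomial has a root of multiplicity > 1."""
    f = trim([Fraction(c) for c in coeffs])
    return len(poly_gcd(f, derivative(f))) > 1


# (x - 1)^2 has a double root at 1.
double_root = [1, -2, 1]
# (x - 1)(x - 1 - 10^-9) has two distinct roots that are very close.
eps = Fraction(1, 10**9)
close_roots = [1 + eps, -(2 + eps), 1]
```

Floating-point root finding would struggle to separate the second polynomial from the first, whereas the exact GCD test distinguishes them without any tolerance parameter.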

The details are not presented here, but there are several methods for
representing real algebraic numbers and corresponding algorithms for manipulating them
efficiently. The running time of cylindrical algebraic decomposition ultimately
depends on this representation. In practice, a numerical root-finding method that
has a precision parameter, $\epsilon$, can be used by choosing $\epsilon$ small enough to ensure
that roots will not be confused. A sufficiently small value can be determined by
applying gap theorems, which give lower bounds on the amount of real root
separation, expressed in terms of the polynomial coefficients [123]. Some methods avoid
requiring a precision parameter. One well-known example is the derivation of a
Sturm sequence of polynomials based on the given polynomial. The polynomials
in the Sturm sequence are then used to find isolating intervals for each of the roots
[58]. The polynomial, together with its isolating interval, can be considered as
an example root representation. Algebraic operations can even be performed using
this representation in time $O(d \lg d)$, in which $d$ is the degree of the polynomial
[676]. See [58, 123, 676] for detailed presentations on the exact representation and
calculation with real algebraic numbers.
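
To make the Sturm sequence idea concrete, here is a small sketch that counts the distinct real roots of a square-free rational polynomial in an interval $(a, b]$ by comparing sign changes of the sequence at the two endpoints. It is a textbook construction written with exact `Fraction` arithmetic, not the implementation of any cited reference. The function names are ours, and the sketch assumes that the polynomial is square-free and nonzero at both endpoints. Coefficients are listed from the constant term upward.

```python
from fractions import Fraction


def trim(p):
    """Copy p without trailing (highest-degree) zero coefficients."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_rem(a, b):
    """Remainder of a divided by b (b nonzero), in exact arithmetic."""
    a, b = trim(a), trim(b)
    while len(a) >= len(b):
        q = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= q * c
        a = trim(a)
    return a


def evaluate(p, x):
    """Horner evaluation of p at x."""
    result = Fraction(0)
    for c in reversed(p):
        result = result * x + c
    return result


def sturm_chain(coeffs):
    """f, f', then negated remainders until the remainder vanishes."""
    f = trim([Fraction(c) for c in coeffs])
    chain = [f, [i * c for i, c in enumerate(f)][1:]]
    while chain[-1]:
        chain.append([-c for c in poly_rem(chain[-2], chain[-1])])
    chain.pop()  # discard the final zero polynomial
    return chain


def sign_changes(chain, x):
    """Number of sign changes in the chain evaluated at x, ignoring zeros."""
    values = [v for v in (evaluate(p, x) for p in chain) if v != 0]
    return sum(1 for u, w in zip(values, values[1:]) if (u > 0) != (w > 0))


def count_roots(coeffs, a, b):
    """Distinct real roots in (a, b]; assumes f(a) != 0 and f(b) != 0."""
    chain = sturm_chain(coeffs)
    return sign_changes(chain, Fraction(a)) - sign_changes(chain, Fraction(b))


# x^4 - 6x^3 + 11x^2 - 6x = x (x - 1)(x - 2)(x - 3)
QUARTIC = [0, -6, 11, -6, 1]
```

Repeatedly bisecting an interval and recounting would narrow each root into its own isolating interval, which is the role the Sturm sequence plays in the representation described above.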

One-dimensional decomposition. To explain the method, we first perform
a semi-algebraic decomposition of $\mathbb{R}$, which is the final step in the projection
sequence. Once this is explained, then the multi-dimensional case will follow
more easily.

Let $\mathcal{F}$ be a set of $m$ univariate polynomials,

$$\mathcal{F} = \{ f_i \in \mathbb{Q}[x] \mid i = 1, \ldots, m \}, \tag{6.18}$$

that are used to define some semi-algebraic set in $\mathbb{R}$. The polynomials in $\mathcal{F}$ could
come directly from a quantifier-free formula $\phi$ (which could even appear inside of
a Tarski sentence, as in (6.9)).

Define a single polynomial $f$ as $f = \prod_{i=1}^{m} f_i$. Suppose that $f$ has $k$ distinct
real roots, which are sorted in increasing order:

$$-\infty < \beta_1 < \beta_2 < \cdots < \beta_{i-1} < \beta_i < \beta_{i+1} < \cdots < \beta_k < \infty. \tag{6.19}$$

The one-dimensional semi-algebraic decomposition is given by the following
sequence of alternating 1-cells and 0-cells:

$$(-\infty, \beta_1),\ [\beta_1, \beta_1],\ (\beta_1, \beta_2),\ \ldots,\ (\beta_{i-1}, \beta_i),\ [\beta_i, \beta_i],\ (\beta_i, \beta_{i+1}),\ \ldots,\ [\beta_k, \beta_k],\ (\beta_k, \infty). \tag{6.20}$$

Any semi-algebraic set that can be expressed using the polynomials in $\mathcal{F}$ can be
expressed as the union of some of the 0-cells and 1-cells given in (6.20). This can also
be considered as a singular complex (it can even be considered as a simplicial
complex, but this will not be true in higher dimensions).

Sample points can be generated for each of the cells as follows. For the
unbounded cells, $(-\infty, \beta_1)$ and $(\beta_k, \infty)$, valid samples are $\beta_1 - 1$ and $\beta_k + 1$,
respectively. For each finite 1-cell, $(\beta_i, \beta_{i+1})$, the midpoint $(\beta_i + \beta_{i+1})/2$ produces a
sample point. For each 0-cell, $[\beta_i, \beta_i]$, the only choice is to use $\beta_i$ as the sample
point.

Figure 6.30: Two parabolas are used to define the semi-algebraic set $[1,2]$.

Figure 6.31: A semi-algebraic decomposition for the polynomials in Figure 6.30.

Example 6.4.4 Figure 6.30 shows a semi- algebraic subset of R that is defined 
by two polynomials, fi(x) = x 2 — 2x and f2(x) = x 2 — 4x + 3. Thus, T = {fi, ^2}- 
Consider quantifier-free formula 

(j){x) = {x 2 - 2x > 0) A {x 2 - 4x + 3 > 0) (6.21) 

The semi-algebraic decomposition into 5 1-cells and 4 0-cells is shown in Figure 
6.31. Note that each cell is sign- invariant. The sample points for the 1-cells are 
— 1, 1/2, 3/2, 5/2, and 4, respectively. The sample points for the 0-cells are 0, 1, 
2, and 3, respectively. 

A decision problem can be nicely solved using the decomposition. Suppose
a Tarski sentence that uses the polynomials in $\mathcal{F}$ has been given. Here is one
possibility:

$\Phi = \exists x\,[(x^2 - 2x \geq 0) \wedge (x^2 - 4x + 3 \geq 0)]$. (6.22)

The sample points alone are sufficient to determine whether $\Phi$ is TRUE or FALSE.
Once $x = -1$ is attempted, it is discovered that $\Phi$ is TRUE. The quantifier
elimination problem cannot yet be considered because more dimensions are needed. ■
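The decision procedure of this example can be sketched as follows. The code assumes that the relations in the sentence are read as "greater than or equal to zero", and the helper names (`sign_vector`, `decide_exists`) are hypothetical. Because every cell is sign-invariant, it is enough to test the nine sample points.

```python
from fractions import Fraction


def f1(x):
    return x * x - 2 * x


def f2(x):
    return x * x - 4 * x + 3


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


# One sample per cell, ordered from left to right along the real line.
SAMPLES = [Fraction(s) for s in
           ("-1", "0", "1/2", "1", "3/2", "2", "5/2", "3", "4")]


def sign_vector(x):
    return (sign(f1(x)), sign(f2(x)))


def decide_exists(predicate):
    """Return the first sample whose sign vector satisfies predicate, else None."""
    for x in SAMPLES:
        if predicate(*sign_vector(x)):
            return x
    return None


# Assumed reading of the sentence: exists x with f1(x) >= 0 and f2(x) >= 0.
witness = decide_exists(lambda s1, s2: s1 >= 0 and s2 >= 0)
```

Here `witness` is the first sample, -1, so the sentence is TRUE. A sentence requiring both polynomials to vanish at once would return `None`, since no sample has sign vector (0, 0).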



The inductive step to higher dimensions Now consider constructing a
cylindrical (semi-) algebraic decomposition for $\mathbb{R}^n$. Figure 6.34 shows an example
for $\mathbb{R}^2$. First consider how to iteratively project the polynomials down to $\mathbb{R}$ to
ensure that when the decomposition of $\mathbb{R}^n$ is constructed, the sign-invariant
property is maintained. It will also be the case that the resulting decomposition
will correspond directly to a singular complex.

Figure 6.32: Critical points occur either when the surface folds over in the vertical
direction or when surfaces intersect.

Let $\mathcal{F}_n$ denote the original set of polynomials in $\mathbb{Q}[x_1, \ldots, x_n]$ that are used
to define the semi-algebraic set (or Tarski sentence) in $\mathbb{R}^n$. Form a single
polynomial $f = \prod_{i=1}^{m} f_i$. Let $f' = \partial f/\partial x_n$, which is also a polynomial. Let
$g = \mathrm{GCD}(f, f')$, which is the greatest common divisor of $f$ and $f'$. The zeros of
$g$ are all points that are zeros of both $f$ and $f'$. Being a zero of $f'$ means that
the surface given by $f = 0$ does not vary locally when perturbing $x_n$. These are
places where a cell boundary needs to be formed because the surface may fold over
itself in the $x_n$ direction, which is not permitted for a cylindrical
decomposition. Another place where a cell boundary needs to be formed is at the
intersection of two or more polynomials in $\mathcal{F}_n$. The projection technique from
$\mathbb{R}^n$ to $\mathbb{R}^{n-1}$ generates a set, $\mathcal{F}_{n-1}$, of polynomials in $\mathbb{Q}[x_1, \ldots, x_{n-1}]$
that satisfy these requirements. The polynomials $\mathcal{F}_{n-1}$ have the property that
at least one contains a zero point below every point $x \in \mathbb{R}^n$ for which
$f(x) = 0$ and $f'(x) = 0$, or at which polynomials in $\mathcal{F}_n$ intersect. The projection
method that constructs $\mathcal{F}_{n-1}$ involves computing principal subresultant
coefficients, which are covered in [58, 677]. Resultants, of which the
subresultants are an extension, are covered in [178].
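As a small worked illustration (not taken from the text), consider $n = 2$ and a single polynomial whose zero set is the unit circle. The elimination below uses direct substitution in place of the general subresultant machinery, which is enough for this simple case.

```latex
f(x_1,x_2) = x_1^2 + x_2^2 - 1, \qquad
f' = \frac{\partial f}{\partial x_2} = 2x_2 .
% Common zeros of f and f': x_2 = 0 and x_1^2 - 1 = 0, i.e. the points (-1,0) and (1,0),
% which is where the circle folds over in the x_2 direction.
% Eliminating x_2 gives a projected polynomial in Q[x_1]:
\mathcal{F}_1 = \{\, x_1^2 - 1 \,\}, \qquad \beta_1 = -1, \quad \beta_2 = 1 .
```

The roots $\beta_1$ and $\beta_2$ split $\mathbb{R}$ into three 1-cells and two 0-cells, and the circle does not fold over above any single one of these cells.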

The polynomials in $\mathcal{F}_{n-1}$ are then projected to $\mathbb{R}^{n-2}$ to obtain $\mathcal{F}_{n-2}$. This
process continues until $\mathcal{F}_1$ is obtained, which is a set of polynomials in $\mathbb{Q}[x_1]$. A
one-dimensional decomposition is formed, as defined earlier. From $\mathcal{F}_1$, a single
polynomial is formed by taking the product, and $\mathbb{R}$ is partitioned into 0-cells and
1-cells. We next describe the process of lifting a decomposition over $\mathbb{R}^{i-1}$ up to
$\mathbb{R}^i$. This technique is applied iteratively until $\mathbb{R}^n$ is reached.

Assume inductively that a cylindrical algebraic decomposition has been
computed for a set of polynomials $\mathcal{F}_{i-1}$ in $\mathbb{Q}[x_1, \ldots, x_{i-1}]$. The decomposition
consists of $k$-cells for which $0 \leq k < i$. Let $p = (x_1, \ldots, x_{i-1}) \in \mathbb{R}^{i-1}$. For each
one of the $k$-cells, $C_{i-1}$, a cylinder over $C_{i-1}$ is defined as the
$(k+1)$-dimensional set

$\{(p, x_i) \in \mathbb{R}^i \mid p \in C_{i-1}\}$. (6.23)

The cylinder is sliced into a strip of $k$-dimensional and $(k+1)$-dimensional cells
by using polynomials in $\mathcal{F}_i$. Let $f_j$ denote one of the $\ell$ slicing polynomials in the
cylinder, sorted in increasing $x_i$ order as $f_1, f_2, \ldots, f_j, f_{j+1}, \ldots, f_\ell$. The
following kinds of cells are produced (see Figure 6.33):

Lower unbounded sector:

$\{(p, x_i) \in \mathbb{R}^i \mid p \in C_{i-1} \mbox{ and } x_i < f_1(p)\}$ (6.24)

Section:

$\{(p, x_i) \in \mathbb{R}^i \mid p \in C_{i-1} \mbox{ and } x_i = f_j(p)\}$ (6.25)

Bounded sector:

$\{(p, x_i) \in \mathbb{R}^i \mid p \in C_{i-1} \mbox{ and } f_j(p) < x_i < f_{j+1}(p)\}$ (6.26)

Upper unbounded sector:

$\{(p, x_i) \in \mathbb{R}^i \mid p \in C_{i-1} \mbox{ and } f_\ell(p) < x_i\}$. (6.27)

There is one degenerate possibility in which there are no slicing polynomials, and
the cylinder over $C_{i-1}$ can be extended into one unbounded cell. In general, the
sample points are computed by picking a point $p \in C_{i-1}$ and making a vertical
column of samples of the form $(p, x_i)$. A polynomial in $\mathbb{Q}[x_i]$ can be generated,
and the samples are placed using the same assignment technique as used for the
one-dimensional decomposition.
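The lifting step can be sketched for a single cylinder. The example below assumes one slicing polynomial, the unit circle $x_1^2 + x_2^2 - 1$, and uses floating-point roots for brevity; the names `circle_roots` and `lift_samples` are hypothetical, and a real implementation would need exact algebraic root representations.

```python
import math


def circle_roots(p):
    """Real roots in x_2 of x_1^2 + x_2^2 - 1 = 0 at the base sample x_1 = p."""
    d = 1 - p * p
    if d < 0:
        return []
    if d == 0:
        return [0.0]
    r = math.sqrt(d)
    return [-r, r]


def lift_samples(root_values):
    """Build the vertical column of samples over one base sample point.

    Returns (kind, x_i) pairs from bottom to top, where kind is "sector"
    or "section", following the cell types (6.24) to (6.27).
    """
    roots = sorted(set(root_values))
    if not roots:
        # Degenerate case: no slicing polynomial, one unbounded cell.
        return [("sector", 0.0)]
    column = [("sector", roots[0] - 1)]          # lower unbounded sector
    for i, r in enumerate(roots):
        column.append(("section", r))            # x_i equals a root
        if i + 1 < len(roots):
            column.append(("sector", (r + roots[i + 1]) / 2))  # bounded sector
    column.append(("sector", roots[-1] + 1))     # upper unbounded sector
    return column


column_over_zero = lift_samples(circle_roots(0.0))
```

Over the base sample $p = 0$ the column has five cells: three sectors with samples -2, 0, and 2, and two sections at -1 and 1. Over a base sample outside $[-1, 1]$ the circle has no real roots, so the cylinder is a single unbounded cell.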

Example 6.4.5 (Mutilating the gingerbread face) Figure 6.34 shows a cylin- 
drical algebraic decomposition of the gingerbread face. It can be seen that the 
resulting complex is very similar to that obtained in Figure 6.18. ■ 

It is important to note that the cells do not necessarily project onto a
rectangular set, as in the case of the higher-dimensional vertical decomposition.
For example, a generic $n$-cell, $C_n$, for a decomposition of $\mathbb{R}^n$ is described as the
open set of $(x_1, \ldots, x_n) \in \mathbb{R}^n$ such that

• $C_0 < x_n < C'_0$ for some 0-cells $C_0, C'_0 \in \mathbb{R}$, which are roots of some $f, f' \in \mathcal{F}_1$.

• $(x_{n-1}, x_n)$ lies between $C_1$ and $C'_1$ for some 1-cells $C_1$, $C'_1$, which are zeros of
some $f, f' \in \mathcal{F}_2$.

  $\vdots$

• $(x_{n-i+1}, \ldots, x_n)$ lies between $C_{i-1}$ and $C'_{i-1}$ for some $(i-1)$-cells $C_{i-1}$, $C'_{i-1}$,
which are zeros of some $f, f' \in \mathcal{F}_i$.

  $\vdots$

• $(x_1, \ldots, x_n)$ lies between $C_{n-1}$ and $C'_{n-1}$ for some $(n-1)$-cells $C_{n-1}$, $C'_{n-1}$,
which are zeros of some $f, f' \in \mathcal{F}_n$.

Figure 6.33: A cylinder over every $k$-cell $C_{i-1}$ is formed. A sequence of
polynomials, $f_1, \ldots, f_\ell$, slices the cylinder into $k$-dimensional sections and
$(k+1)$-dimensional sectors.

The resulting decomposition is sign-invariant, which allows the decision and
quantifier elimination problems to be solved in finite time. To solve a decision
problem, the polynomials in $\mathcal{F}_n$ are evaluated at every sample point to determine
whether one of them satisfies the Tarski sentence. To solve the quantifier
elimination problem, note that any semi-algebraic set that can be constructed from
$\mathcal{F}_n$ can be defined as a union of some cells in the decomposition. For the given
Tarski sentence, $\mathcal{F}_n$ is formed from all polynomials that are mentioned in the
sentence, and the cell decomposition is performed. Once obtained, the sign
information is used to determine which cells need to be included in the union. The
resulting union of cells is designed to include only the points in $\mathbb{R}^n$ at which
the Tarski sentence is TRUE.
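The idea of collecting cells into a union can be illustrated on the one-dimensional decomposition of Example 6.4.4. This is a sketch only: the cell labels are hypothetical strings, and the formula is assumed to use "greater than or equal to zero" in both atoms.

```python
from fractions import Fraction as F

# (cell label, sample point) for the decomposition with roots 0, 1, 2, 3.
CELLS = [
    ("(-inf, 0)", F(-1)), ("[0, 0]", F(0)), ("(0, 1)", F(1, 2)),
    ("[1, 1]", F(1)), ("(1, 2)", F(3, 2)), ("[2, 2]", F(2)),
    ("(2, 3)", F(5, 2)), ("[3, 3]", F(3)), ("(3, inf)", F(4)),
]


def phi(x):
    """Assumed reading of the formula: both polynomials are nonnegative."""
    return x * x - 2 * x >= 0 and x * x - 4 * x + 3 >= 0


def satisfying_cells(formula):
    """Return labels of the cells whose sample point satisfies formula.

    Sign invariance means the sample speaks for its whole cell.
    """
    return [label for label, sample in CELLS if formula(sample)]


selected = satisfying_cells(phi)
```

Under this reading, `selected` contains the four cells `(-inf, 0)`, `[0, 0]`, `[3, 3]`, and `(3, inf)`, so the set of points satisfying the formula is the union of two closed rays.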

Figure 6.34: A cylindrical algebraic decomposition of the gingerbread face. There
are 37 2-cells, 64 1-cells, and 28 0-cells. The straight 1-cells are intervals of the
vertical lines, and the curved ones are portions of the zero set of a polynomial in
$\mathcal{F}$. The decomposition of $\mathbb{R}$ is also shown.

Solving a motion planning problem The cylindrical algebraic decomposition
is also capable of solving any of the motion planning problems formulated in
Chapter 4. First assume that $\mathcal{C} = \mathbb{R}^n$. Just as for other decompositions, a roadmap
is formed in which every vertex is an $n$-cell, and edges connect every pair of
adjacent $n$-cells by traveling through an $(n-1)$-cell. It is straightforward to
determine adjacencies inside of a cylinder, but there are several technical
details associated with determining adjacencies of cells from different cylinders
[58] (pages 152-154 present an example that illustrates the problem). The cells of
dimension less than $n - 1$ are not needed for motion planning purposes (just as
vertices were not needed for the vertical decomposition in Section 6.2.2). The
query points, $q_I$ and $q_G$, are connected to the roadmap depending on the cell in
which they lie, and a discrete search is performed.
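Once the adjacencies between $n$-cells are known, the discrete search is ordinary graph search. The sketch below assumes the adjacency information is already available as a dictionary (computing it is the hard part, as noted above); the function name and the cell labels are hypothetical.

```python
from collections import deque


def find_cell_path(adjacency, start_cell, goal_cell):
    """Breadth-first search over n-cells.

    adjacency maps each cell to the cells reachable through a shared
    (n-1)-cell. Returns a list of cells from start to goal, or None.
    """
    parent = {start_cell: None}
    queue = deque([start_cell])
    while queue:
        cell = queue.popleft()
        if cell == goal_cell:
            path = []
            while cell is not None:      # walk parent links back to the start
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        for neighbor in adjacency.get(cell, ()):
            if neighbor not in parent:
                parent[neighbor] = cell
                queue.append(neighbor)
    return None


# Hypothetical adjacency: D is an n-cell that no other cell can reach.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": []}
path = find_cell_path(adjacency, "A", "C")
```

Here `path` is `["A", "B", "C"]`, while a query from `"A"` to `"D"` returns `None`, which corresponds to reporting that no solution path exists.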

If $\mathcal{C} \subset \mathbb{R}^n$ and its dimension is $k$ for $k < n$, then all of the interesting cells
are of lower dimension. This occurs, for example, due to the constraints on the
matrices to force them to lie in $SO(2)$ or $SO(3)$. This may also occur for problems
from Section 4.4, in which closed chains reduce the degrees of freedom. The
cylindrical algebraic decomposition method can still solve such problems; however,
the exact root representation problem becomes more complicated when determining
the cell adjacencies. A discussion of these issues appears in [676]. For the case
of $SO(2)$ and $SO(3)$, this complication can be avoided by using stereographic
projection to map $\mathbb{S}^1$ or $\mathbb{S}^3$ to $\mathbb{R}$ or $\mathbb{R}^3$, respectively. This mapping removes a single
point from each, but the connectivity of $\mathcal{C}_{free}$ remains unharmed. The antipodal
identification problem for unit quaternions represented by $\mathbb{S}^3$ also does not
present a problem; there is a redundant copy of $\mathcal{C}$, which does not affect the
connectivity.
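For the $SO(2)$ case, the stereographic projection can be written down directly. The sketch below assumes projection from the point $(0, 1)$ of the unit circle onto the horizontal axis; this choice of pole and the function names are illustrative assumptions, not prescribed by the text.

```python
import math


def stereographic(x, y):
    """Map a point (x, y) on the unit circle, other than (0, 1), to the real line."""
    if y == 1:
        raise ValueError("the projection pole (0, 1) has no image")
    return x / (1 - y)


def inverse_stereographic(t):
    """Map a real number t back to the unit circle."""
    d = t * t + 1
    return (2 * t / d, (t * t - 1) / d)


x, y = inverse_stereographic(2.5)
on_circle = math.isclose(x * x + y * y, 1.0)
round_trip = math.isclose(stereographic(x, y), 2.5)
```

Both maps are rational functions, so polynomial constraints on the circle translate into polynomial constraints on the line after clearing denominators. The single removed point is the pole $(0, 1)$.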

The running time for cylindrical algebraic decomposition depends on many
factors, but in general it is polynomial in the number of polynomials in $\mathcal{F}_n$,
polynomial in the maximum algebraic degree of the polynomials, and
doubly-exponential in the dimension. Complexity issues will be covered in more
detail in Section 6.5.3.

6.4.3 Canny's Roadmap Algorithm 

The doubly-exponential running time of cylindrical algebraic decomposition in- 
spired researchers to do better. It has been shown that quantifier elimination 
requires doubly-exponential time [187]; however, motion planning is a different 
problem. Canny introduced a method that produces a roadmap directly from the 
semi-algebraic set, rather than constructing a cell decomposition along the way. 
Since there are doubly-exponentially many cells in the cylindrical algebraic de- 
composition, avoiding this construction pays off. The resulting roadmap method 
of Canny solves the motion planning problem in time that is again polynomial 
in the number of polynomials, polynomial in the algebraic degree, but is only 
singly-exponential in dimension [123]. 

Much like the other combinatorial motion planning approaches, it is based on
finding critical curves and points. The main idea is to construct linear mappings
from $\mathbb{R}^n$ to $\mathbb{R}^2$ that produce silhouette curves of the semi-algebraic sets.
Performing one such mapping on the original semi-algebraic set will yield a
roadmap, but it might not preserve the original connectivity. Therefore, linear
mappings from $\mathbb{R}^{n-1}$ to $\mathbb{R}^2$ are performed on some $(n-1)$-dimensional slices of the
original semi-algebraic set to yield more roadmap curves. This process is applied
recursively until the slices are already one-dimensional. The resulting roadmap is
formed from the union of all the pieces obtained in the recursive calls. The
resulting roadmap was shown to have the same connectivity as the original
semi-algebraic set [123].

Suppose that $\mathcal{C} = \mathbb{R}^n$. Let $\mathcal{F} = \{f_1, \ldots, f_m\}$ denote the set of polynomials
that define the semi-algebraic set, which is assumed to be represented as a
disjoint union of manifolds. Assume that each $f_i \in \mathbb{Q}[x_1, \ldots, x_n]$. First, a small
perturbation to the input polynomials $\mathcal{F}$ is performed to ensure that every
sign-invariant set of $\mathbb{R}^n$ is a manifold. This forces the polynomials into a kind of
general position, which can be achieved with probability one using random
perturbations; there are also deterministic methods to solve this problem. The
general position requirements on the input polynomials and the 2D projection
directions are fairly strong, which has stimulated more recent work that
eliminates many of the problems [58]. From this point onward, it will be assumed
that the polynomials are in general position.

Recall the sign assignment function from Section 6.4.1. Each sign-invariant
set is a manifold because of the general position assumption. Canny's method
computes a roadmap for any $k$-dimensional manifold for $k < n$. Such a manifold
will have precisely $n - k$ signs that are 0 (which means that points lie precisely
on the zero sets of $n - k$ polynomials in $\mathcal{F}$). At least one of the signs must be
0, which means that Canny's roadmap actually lies in $\partial \mathcal{C}_{free}$ (this technically is
not permitted, but nevertheless the algorithm correctly decides whether a solution
path exists through $\mathcal{C}_{free}$).

Recall that each $f_i$ is a function, $f_i : \mathbb{R}^n \rightarrow \mathbb{R}$. Let $x$ denote $(x_1, \ldots, x_n) \in \mathbb{R}^n$.
The $k$ polynomials that have zero signs can be put together sequentially to produce
a mapping $\psi : \mathbb{R}^n \rightarrow \mathbb{R}^k$. The $i$th component of the vector $\psi(x)$ is $\psi_i(x) = f_i(x)$.
This is closely related to the sign assignment function of Section 6.4.1, except that
now the real value from each polynomial is directly used, rather than taking its
sign.

Now introduce a function, $g : \mathbb{R}^n \rightarrow \mathbb{R}^j$, in which either $j = 1$ or $j = 2$ (the
general concepts presented below will work for other values of $j$, but 1 and 2 are
the only values needed for Canny's method). The function $g$ will serve the same
purpose as a projection in cylindrical algebraic decomposition, but note that $g$
immediately drops from dimension $n$ to dimension 2 or 1, instead of dropping to
$n - 1$ as in the case of cylindrical projections and liftings.

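Let $h : \mathbb{R}^n \rightarrow \mathbb{R}^{k+j}$ denote a mapping that is constructed directly from $\psi$ and $g$
as follows. For the $i$th component, if $i \leq k$, then $h_i(x) = \psi_i(x) = f_i(x)$. Assume
that $k + j \leq n$. If $i > k$, then $h_i(x) = g_{i-k}(x)$. Let $J_x(h)$ denote the Jacobian of
$h$ at $x$, defined as

$J_x(h) = \frac{\partial h(x)}{\partial x} =
\begin{pmatrix}
\frac{\partial h_1(x)}{\partial x_1} & \cdots & \frac{\partial h_1(x)}{\partial x_n} \\
\vdots & & \vdots \\
\frac{\partial h_{k+j}(x)}{\partial x_1} & \cdots & \frac{\partial h_{k+j}(x)}{\partial x_n}
\end{pmatrix}
=
\begin{pmatrix}
\frac{\partial f_1(x)}{\partial x_1} & \cdots & \frac{\partial f_1(x)}{\partial x_n} \\
\vdots & & \vdots \\
\frac{\partial f_k(x)}{\partial x_1} & \cdots & \frac{\partial f_k(x)}{\partial x_n} \\
\frac{\partial g_1(x)}{\partial x_1} & \cdots & \frac{\partial g_1(x)}{\partial x_n} \\
\vdots & & \vdots \\
\frac{\partial g_j(x)}{\partial x_1} & \cdots & \frac{\partial g_j(x)}{\partial x_n}
\end{pmatrix}$. (6.28)

A point $x \in \mathbb{R}^n$ at which $J_x(h)$ is singular is called a critical point. The matrix is
defined to be singular if every $(k+j) \times (k+j)$ subdeterminant is zero. Each of the
first $k$ rows of $J_x(h)$ calculates the surface normal to $f_i(x) = 0$. If these normals
are not linearly independent of the directions given by the last $j$ rows, then the
matrix becomes singular. The following example from [119] nicely illustrates this
principle.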






Example 6.4.6 Let $n = 3$, $k = 1$, and $j = 1$. The zeros of a single polynomial
$f_1$ define a two-dimensional subset of $\mathbb{R}^3$. Let this subset be the unit sphere,
$\mathbb{S}^2$, defined as the zeros of the polynomial

$f_1(x_1, x_2, x_3) = x_1^2 + x_2^2 + x_3^2 - 1$. (6.29)

Suppose that $g : \mathbb{R}^3 \rightarrow \mathbb{R}$ is defined as $g(x_1, x_2, x_3) = x_1$. The Jacobian (6.28)
becomes

$\begin{pmatrix} 2x_1 & 2x_2 & 2x_3 \\ 1 & 0 & 0 \end{pmatrix}$, (6.30)

and is singular when all three of the possible $2 \times 2$ subdeterminants are zero. This
occurs if and only if $x_2 = x_3 = 0$. This yields the critical points $(-1, 0, 0)$ and
$(1, 0, 0)$ on $\mathbb{S}^2$. Note that this is precisely when the surface normals of $\mathbb{S}^2$ are
parallel to the vector $[1\ 0\ 0]$.

Now suppose that $j = 2$ to obtain $g : \mathbb{R}^3 \rightarrow \mathbb{R}^2$, which is defined as
$g(x_1, x_2, x_3) = (x_1, x_2)$. In this case, (6.28) becomes

$\begin{pmatrix} 2x_1 & 2x_2 & 2x_3 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$, (6.31)

which is singular if and only if $x_3 = 0$. The critical points are therefore the
$x_1 x_2$-plane intersected with $\mathbb{S}^2$, which yields the equator points (all
$(x_1, x_2) \in \mathbb{R}^2$ such that $x_1^2 + x_2^2 = 1$). In this case, more points are generated
because the matrix becomes degenerate for any surface normal of $\mathbb{S}^2$ that is
parallel to $[1\ 0\ 0]$, $[0\ 1\ 0]$, or any linear combination of them. ■
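The two singularity tests in this example are small enough to check by direct computation. The sketch below hard-codes the gradient of the sphere polynomial and the two axis-aligned choices of $g$; the function names are hypothetical, and rational coordinates are used so that the zero tests are exact.

```python
from fractions import Fraction


def grad_f1(x):
    """Gradient of f1 = x1^2 + x2^2 + x3^2 - 1, the first row of the Jacobian."""
    return [2 * x[0], 2 * x[1], 2 * x[2]]


def minors_2x2(row_a, row_b):
    """All 2x2 subdeterminants of the 2 x n matrix with rows row_a and row_b."""
    n = len(row_a)
    return [row_a[i] * row_b[j] - row_a[j] * row_b[i]
            for i in range(n) for j in range(i + 1, n)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def is_critical_j1(x):
    """Case j = 1 with g(x) = x1: singular iff every 2x2 minor vanishes."""
    return all(d == 0 for d in minors_2x2(grad_f1(x), [1, 0, 0]))


def is_critical_j2(x):
    """Case j = 2 with g(x) = (x1, x2): singular iff the 3x3 determinant vanishes."""
    return det3([grad_f1(x), [1, 0, 0], [0, 1, 0]]) == 0


equator_point = (Fraction(3, 5), Fraction(4, 5), 0)   # lies on the unit sphere
```

The point `(1, 0, 0)` is critical in both cases, `equator_point` is critical only for $j = 2$, and the pole `(0, 0, 1)` is critical in neither, which matches the two isolated points and the equator described above.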



The first mapping in Example 6.4.6 yielded two isolated critical points, and the
second mapping yielded a one-dimensional set of critical points, which is referred
to as a silhouette. The union of the silhouette and the isolated critical points
yields a roadmap for $\mathbb{S}^2$. Now consider generalizing this example to obtain the
full algorithm for general $n$ and $k$. A linear mapping $g : \mathbb{R}^n \rightarrow \mathbb{R}^2$ is constructed,
which might not be axis-aligned as in Example 6.4.6 because it must be chosen in
general position (otherwise degeneracies might arise in the roadmap). Define $\psi$ to
be the set of polynomials that become zero on the desired manifold on which to
construct a roadmap. Form the matrix (6.28), and determine the silhouette. This
is accomplished in general using subresultant techniques, which were also needed
for cylindrical algebraic decomposition; see [58, 123] for details. Let $g_1$ denote the
first component of $g$, which yields a mapping $g_1 : \mathbb{R}^n \rightarrow \mathbb{R}$. Forming (6.28) using
$g_1$ yields a finite set of critical points. Taking the union of the critical points and
the silhouette produces part of the roadmap.

So far, however, there are no guarantees that the connectivity is preserved. To
handle this problem, Canny's algorithm proceeds recursively. For each of the
critical points, $x \in \mathbb{R}^n$, an $(n-1)$-dimensional hyperplane through $x$ is chosen for
which the $g_1$ row of (6.28) is the normal (hence it is perpendicular in some sense to
the flow of $g_1$). Inside of this hyperplane, a new $g$ mapping is formed. This time
a new direction is chosen, and the mapping takes the form $g : \mathbb{R}^{n-1} \rightarrow \mathbb{R}^2$. Once
again, the silhouettes and critical points are found and added to the roadmap.
This process is repeated recursively until the base case in which the silhouettes
and critical points are directly obtained without forming $g$.

Figure 6.35: Suppose that the semi-algebraic set is a solid torus in $\mathbb{R}^3$.

It is helpful now to consider an example. Since the method involves a sequence
of 2D projections, it is difficult to visualize. Examples in $\mathbb{R}^4$ and higher involve
more than one of the 2D projections. An example over $\mathbb{R}^3$ is presented here; see
[123] for another example over $\mathbb{R}^3$.

Example 6.4.7 (The solid torus in $\mathbb{R}^3$) Consider the three-dimensional algebraic
set shown in Figure 6.35. After defining the mapping $g(x_1, x_2, x_3) = (x_1, x_2)$, the
roadmap shown in Figure 6.36 is obtained. The silhouettes are obtained from $g$,
and the critical points are obtained from $g_1$. Note that the original connectivity of
the solid torus is not preserved because the inner ring does not connect to the outer
ring. This illustrates the need to also compute the roadmap for lower-dimensional
slices. For each of the four critical points, the critical curves are computed for a
plane that is parallel to the $x_2 x_3$-plane, and for which the $x_1$ position is
determined by the critical point. The slice for one of the inner critical points is
shown in Figure 6.37. In this case, the slice already has two dimensions. New
silhouette curves are added to the roadmap to obtain the final result shown in
Figure 6.38. ■

To solve a planning problem, the query points $q_I$ and $q_G$ are artificially declared
to be critical points in the top level of recursion. This forces the algorithm to
generate curves that connect them to the rest of the roadmap.

Figure 6.36: The projection into the $x_1 x_2$-plane yields silhouettes for the inner
and outer rings, and also four critical points.

Figure 6.37: A slice taken for the inner critical points is parallel to the $x_2 x_3$-plane.
The roadmap for the slice connects to the silhouettes from Figure 6.36,
which preserves the connectivity of the original set in Figure 6.35.

Figure 6.38: All of the silhouettes and critical points are merged to obtain the
roadmap.

The completeness of the method requires very careful analysis, which is
thoroughly covered in [58, 123]. The main elements of the analysis are: 1) showing
that the polynomials can be perturbed and $g$ can be chosen to ensure general
position, 2) the singularity conditions on (6.28) lead to algebraic sets (varieties),
and 3) the resulting roadmap has the required properties mentioned in Section 6.1
of being accessible and connectivity-preserving for $\mathcal{C}_{free}$ (actually, it is shown
for $\partial \mathcal{C}_{free}$). The method above explained how to compute the roadmap for each
sign-invariant set, but to obtain a roadmap for the planning problem, the roadmaps
from each sign-invariant set must be connected together correctly; fortunately,
this was established. See the Linking Lemma of [119].

6.5 Complexity of Motion Planning 

This section summarizes some theoretical work that characterizes the complexity
of motion planning problems. Note that this is not equivalent to characterizing
the running time of particular algorithms. The existence of an algorithm serves as
an upper bound on the problem difficulty because it is a proof by example that
solving the problem requires no more time than what is needed by the algorithm. On
the other hand, lower bounds are also very useful because they give an indication
of the difficulty of the problem itself. Suppose, for example, you are given an
algorithm that solves a problem in time $O(n^2)$. Does it make sense to try to find a
more efficient algorithm? Does it make sense to try to find a general-purpose
motion planning algorithm that runs in time that is polynomial in the dimension?
Lower bounds provide answers to questions such as these. Usually lower bounds are
obtained by concocting bizarre, complicated examples that are allowed by the
problem definition, but probably not considered by the person who first formulated
the problem. In this line of research, progress is made by either raising the
lower bound (unless it is already tight), or by showing that a narrower version of
the problem still allows such bizarre examples. The latter case occurs often in
motion planning.

6.5.1 Lower Bounds 

Lower bounds have been established for a variety of motion planning problems,
and also a wide variety of planning problems in general. To interpret these bounds,
a basic understanding of the theory of computation is required [339, 711]. This
fascinating subject will be unjustly summarized in a few paragraphs. A problem is
a set of instances that are each carefully encoded as a binary string. An algorithm
is formally considered as a Turing machine, which is a finite-state machine that
can read and write bits to an unbounded piece of tape. Algorithms are usually
formulated to make a binary output, which involves accepting or rejecting a
problem instance that is initially written to the tape and given to the algorithm.
In motion planning, this amounts to deciding whether or not a solution path exists
for a given problem instance.

Languages A language is a set of binary strings associated with a problem.
It represents the complete set of instances of a problem. An algorithm is said to
decide a language if in finite time it correctly accepts all strings that belong
to it, and rejects all others. The interesting question is: How much time or space
is required to decide a language? This question is asked of the problem, under the
assumption that the best possible algorithm would be used to decide it. (We can
easily think of inefficient algorithms that waste resources.)

Figure 6.39: It is known that P $\subset$ EXPTIME is a strict subset; however, it is
not known precisely how large NP and PSPACE are.

A complexity class is a set of languages that can all be decided within some
specified resource bound. The class P is the set of all languages (and hence
problems) for which a polynomial-time algorithm exists (i.e., the algorithm runs
in time $O(n^k)$ for some integer $k$). By definition, an algorithm is called efficient
if it decides its associated language in polynomial time. 7 If no efficient algorithm
exists, then the problem is called intractable. The relationship between several
other classes that often emerge in theoretical motion planning is shown in Figure
??. The class NP is the set of languages that can be solved in polynomial time by a
nondeterministic Turing machine. Some discussion of nondeterministic machines
appears in Section ??. Intuitively, it means that solutions can be verified in
polynomial time because the machine magically knows which choices to make
while trying to make the decision. The class PSPACE is the set of languages that
can be decided with no more than a polynomial amount of storage space during the
execution of the algorithm (NPSPACE = PSPACE, so there is no nondeterministic
version). The class EXPTIME is the set of languages that can be decided in time
$O(2^{n^k})$ for some integer $k$. It is known that EXPTIME is larger than P, but it
is not known precisely where NP and PSPACE lie. It might be the case that P
= NP = PSPACE (although hardly anyone believes this), or it could be that
NP = PSPACE = EXPTIME. Because of this uncertainty, one cannot say that a
problem is intractable if it is NP-hard or PSPACE-hard; one can, however, if the
problem is EXPTIME-hard. One additional remark: it is convenient to remember
that PSPACE-hard implies NP-hard.
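The relationships described in this paragraph can be collected in one line. The chain of containments below is the standard one from the theory of computation; of the four inclusions, the text above only states that the outermost pair is known to be strict.

```latex
\mathrm{P} \subseteq \mathrm{NP} \subseteq \mathrm{PSPACE} \subseteq \mathrm{EXPTIME},
\qquad
\mathrm{P} \subsetneq \mathrm{EXPTIME}.
```

Because NP is contained in PSPACE, a problem that is at least as hard as every language in PSPACE is in particular at least as hard as every language in NP, which is the reason for the closing remark that PSPACE-hard implies NP-hard.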

Hardness and completeness Since an easier class is included as a subset of
a harder one, it is helpful to have a notion of a language (i.e., problem) being
among the hardest possible within a class. Let X refer to either P, NP, PSPACE,
or EXPTIME. A language A is called X-hard if every language, B, in class X is
polynomial-time reducible to A. In short, this means that in polynomial time,
any instance of B can be translated into an instance of A, and then the
decision for A can be correctly translated back in polynomial time to correctly
decide B. Thus, if A can be decided, then within a polynomial-time factor, every
language in X can be decided. The hardness concept can even be applied to
a language (problem) that does not belong to the class. For example, we can
declare that a language A is NP-hard even if A $\notin$ NP (it could be harder, and lie
in EXPTIME, for example). If it is known that the language is both hard for some
class X and is also a member of X, then it is called X-complete (i.e., NP-complete,
PSPACE-complete, etc.). 8

7 Note that this definition may be absurd in practice; an algorithm that runs in time $O(n^{90125})$
would probably not be too efficient for most purposes.

Figure 6.40: Even motion planning for a bunch of translating rectangles inside of
a rectangular box in $\mathbb{R}^2$ is PSPACE-hard.

Lower bounds for motion planning The general motion planning problem,
Formulation 4.3.1, was shown in 1979 to be PSPACE-hard by Reif [651]. In fact,
the problem was restricted to polyhedral obstacles and a finite number of
polyhedral robot bodies attached by spherical joints. The coordinates of all
polyhedra are assumed to be in $\mathbb{Q}$ (this enables a finite-length string encoding
of the problem instance). The proof introduces a fascinating motion planning
instance that involves many attached, dangling robot parts that must work their way
through a complicated system of tunnels, which together simulate the operation of a
symmetric Turing machine. Canny later established that the problem in Formulation
4.3.1 (with rational polynomial coefficients) lies in PSPACE. Therefore, the
general motion planning problem is PSPACE-complete.

8 If you remember hearing that a planning problem is NP-something, but cannot remember
whether it was NP-hard or NP-complete, then it is safe to say NP-hard because NP-complete
implies NP-hard. This can similarly be said for other classes, such as PSPACE-complete vs.
PSPACE-hard.

Many other lower bounds have been shown for a variety of planning problems.
One famous example is the Warehouseman's problem shown in Figure 6.40. This
problem involves a finite number of translating, axis-aligned rectangles in a
rectangular world. It was shown in [338] to be PSPACE-hard. This example is a
beautiful illustration of how such a deceptively simple problem formulation can
lead to such a high lower bound. More recently, it was even shown that planning
for Sokoban, which is a warehouseman's problem on a discrete 2D grid, is also
PSPACE-hard [182]. Other general motion planning problems that were shown
to be PSPACE-hard include motion planning for a chain of bodies in the plane
[337, 370], and motion planning for a chain of bodies among polyhedral obstacles
in $\mathbb{R}^3$. Many lower bounds have been established for a variety of extensions
and variations of the general motion planning problem. For example, in [122] it
was established that a certain form of planning under uncertainty for a robot in a
3D polyhedral environment is NEXPTIME-hard, which is harder than any of the
classes shown in Figure 6.39; the hardest problems in NEXPTIME are
believed to require doubly-exponential time to solve.

These lower-bound or hardness results depend significantly on the precise rep- 
resentation of the problem. For example, it is possible to make problems look eas- 
ier by making instance encodings that are exponentially longer than they should 
be. The running time or space required is expressed in terms of n, the input size. 
If the motion planning problem instances are encoded with exponentially more 
bits than necessary, then a language that belongs to P will be obtained. As long 
as the instance encoding is within a polynomial factor of the optimal encoding, 
then this bizarre behavior is avoided. Another important part of the
representation is to pay attention to how parameters in the problem formulation can
vary. We can redefine motion planning to be all instances for which the dimension
of $\mathcal{C}$ is never greater than $2^{1000}$. The number of dimensions is sufficiently large
for virtually any application. The resulting language for this problem belongs to P
because cylindrical algebraic decomposition and Canny's algorithm can solve any
motion planning problem in polynomial time. Why? Because now the dimension
parameter in the time complexity expressions can be replaced by $2^{1000}$, which is a
constant. This formally implies that an efficient algorithm exists for any motion
planning problem that we would ever care about. This implication has no practical 
value, however. Thus, be very careful when interpreting theoretical bounds. 

The lower bounds may appear discouraging. There are two general directions
to go from here. One is to weaken the requirements, and tolerate algorithms that
yield some kind of resolution, dispersion, or probabilistic completeness. This
approach was taken in Chapter 5, and leads to many efficient algorithms. Another
direction is to define narrower problems that do not include the bizarre
constructions that led to bad lower bounds. For the narrower problems, it may be
possible to design interesting, efficient algorithms. This approach was taken for
the methods in Sections 6.2 and 6.3. In Section 6.5.3, upper bounds for some
algorithms that address these narrower problems will be presented, along with
bounds for the general motion planning algorithms. Several of the upper bounds
involve Davenport-Schinzel sequences, which are therefore covered next.

Figure 6.41: The lower envelope of a collection of functions.

6.5.2 Davenport-Schinzel Sequences 

Davenport-Schinzel sequences provide a powerful characterization of the structure 
that arises from the lower or upper envelope of a collection of functions. The lower 
envelope of five functions is depicted in Figure 6.41. Such envelopes arise in many 
problems throughout computational geometry, including many motion planning 
problems. They are an important part of the design and analysis of many modern 
algorithms, and the resulting algorithm time-complexity usually involves terms 
that follow directly from the sequences. Therefore, it is worthwhile to understand 
some of the basics before interpreting some of the results of Section 6.5.3. Much 
more information on Davenport-Schinzel sequences and their applications appears 
in [687]. The brief introduction presented here is based on [686]. 

For positive integers $n$ and $s$, an $(n, s)$ Davenport-Schinzel sequence is a
sequence $(u_1, \ldots, u_m)$ composed from a set of $n$ symbols such that

1. The same symbol may not appear consecutively in the sequence. In other
words, $u_i \neq u_{i+1}$ for any $i$ such that $1 \le i < m$.

2. The sequence does not contain any alternating subsequence that uses two
symbols and has length $s + 2$. A subsequence can be formed by deleting any
elements in the original sequence. The condition can be expressed as: there
do not exist $s + 2$ indices $i_1 < i_2 < \cdots < i_{s+2}$ for which
$u_{i_1} = u_{i_3} = u_{i_5} = \cdots = a$ and $u_{i_2} = u_{i_4} = u_{i_6} = \cdots = b$,
for some symbols $a$ and $b$.

As an example, an $(n, 3)$ sequence cannot appear as
$(a \cdots b \cdots a \cdots b \cdots a)$, in which each $\cdots$ is filled in with
any sequence of symbols. Let $\lambda_s(n)$ denote the maximum possible length of
an $(n, s)$ Davenport-Schinzel sequence.
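
The two conditions above can be checked mechanically. The following is a minimal
Python sketch, not part of the original text; the function names
`longest_alternation` and `is_davenport_schinzel` are hypothetical. It relies on the
observation that the longest alternating subsequence on two symbols is found
greedily by scanning left to right and counting each change of symbol.

```python
from itertools import combinations


def longest_alternation(seq, a, b):
    """Length of the longest alternating subsequence of seq that uses only a and b."""
    length = 0
    last = None
    for u in seq:
        # Count a symbol only when it differs from the previously counted one.
        if u in (a, b) and u != last:
            length += 1
            last = u
    return length


def is_davenport_schinzel(seq, s):
    """Return True if seq satisfies both conditions of an (n, s) sequence."""
    # Condition 1: no symbol appears twice in a row.
    if any(x == y for x, y in zip(seq, seq[1:])):
        return False
    # Condition 2: no pair of symbols alternates with length s + 2 or more.
    return all(longest_alternation(seq, a, b) < s + 2
               for a, b in combinations(set(seq), 2))
```

For instance, `is_davenport_schinzel("ababa", 3)` is rejected because the sequence
itself is the forbidden pattern of length $s + 2 = 5$, whereas `"abab"` is accepted
for $s = 3$.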

The connection between Figure 6.41 and these sequences can now be explained.
Consider the sequence of function indices that visit the lower envelope. In the
example, this sequence is $(5, 2, 3, 4, 1)$. Suppose it is known that each pair of
functions intersects in at most $s$ places. If there are $n$ real-valued continuous
functions, then the sequence of function indices must be an $(n, s)$
Davenport-Schinzel sequence. It is amazing that such sequences cannot be very
long. For a fixed $s$, their maximum length is close to being linear in $n$.
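
To make the index sequence concrete, here is a small Python sketch that
approximates it by sampling. It is an illustration only: the helper name
`lower_envelope_sequence` and the three example lines are assumptions, and a
sampled grid can miss envelope pieces that are narrower than the grid spacing.

```python
def lower_envelope_sequence(functions, xs):
    """Indices of the functions attaining the minimum, in left-to-right order.

    functions: list of callables f(x); xs: increasing sample points.
    Ties are broken in favor of the smaller index.
    """
    sequence = []
    for x in xs:
        values = [f(x) for f in functions]
        winner = values.index(min(values))
        # Record an index only when the minimizing function changes.
        if not sequence or sequence[-1] != winner:
            sequence.append(winner)
    return sequence


# Three lines; any two lines cross at most once, so s = 1.
lines = [lambda x: x, lambda x: -x, lambda x: -1.0]
xs = [-3.0 + 6.0 * i / 600 for i in range(601)]
envelope = lower_envelope_sequence(lines, xs)
```

Here `envelope` has three entries, which matches the bound $\lambda_1(3) = 3$ for
functions that pairwise intersect at most once.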

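The standard bounds for Davenport-Schinzel sequences are [686] (see footnote 9):

    $\lambda_1(n) = n$ (6.32)
    $\lambda_2(n) = 2n - 1$ (6.33)
    $\lambda_3(n) = \Theta(n \alpha(n))$ (6.34)
    $\lambda_4(n) = \Theta(n \cdot 2^{\alpha(n)})$ (6.35)
    $\lambda_{2s}(n) \le n \cdot 2^{\alpha(n)^{s-1} + C_{2s}(n)}$ (6.36)
    $\lambda_{2s+1}(n) \le n \cdot 2^{\alpha(n)^{s-1} \lg \alpha(n) + C'_{2s+1}(n)}$ (6.37)
    $\lambda_{2s}(n) = \Omega(n \cdot 2^{\frac{1}{(s-1)!} \alpha(n)^{s-1} + C'_{2s}(n)})$. (6.38)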



In the expressions above, $C_r(n)$ and $C'_r(n)$ are terms that are smaller than
their leading exponents. The $\alpha(n)$ term is the inverse Ackermann function,
which is an extremely slow-growing function that appears frequently in
algorithms. The Ackermann function is defined as follows. Let $A_1(m) = 2m$ and
$A_{n+1}(m)$ represent $m$ applications of $A_n$. Thus, $A_1(m)$ performs doubling,
$A_2(m)$ performs exponentiation, and $A_3(m)$ performs tower exponentiation, which
makes a stack of 2's,

    $2^{2^{\cdot^{\cdot^{\cdot^{2}}}}}$, (6.39)

which has height $m$. The Ackermann function is defined as $A(n) = A_n(n)$. This
function grows so fast that $A(4)$ is already an exponential tower of 2's that has
height 65536. Thus, the inverse Ackermann function, $\alpha$, grows very slowly. If
$n$ is less than or equal to an exponential tower of 65536 2's, then
$\alpha(n) \le 4$. Even when it appears in exponents of the Davenport-Schinzel
bounds, it does not represent a significant growth rate.
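
The hierarchy can be evaluated directly for very small arguments. The sketch
below is illustrative only. It assumes that "$m$ applications of $A_n$" means
applying $A_n$ repeatedly starting from the value 1, which reproduces the
doubling, exponentiation, and tower behavior described above; the function name
`A` mirrors the notation in the text.

```python
def A(k, m):
    """Level k of the hierarchy: A_1(m) = 2m; A_{k+1}(m) applies A_k to 1, m times."""
    if k == 1:
        return 2 * m
    value = 1  # assumed starting value for the repeated application
    for _ in range(m):
        value = A(k - 1, value)
    return value


# Diagonal values A(n) = A_n(n) for n = 1, 2, 3.
diagonal = [A(n, n) for n in (1, 2, 3)]
```

Under this assumption the diagonal values are 2, 4, and 16, and
`A(3, 4)` is already 65536. Evaluating `A(4, 4)` is hopeless, since it
corresponds to the tower of height 65536 mentioned above.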



9 The following asymptotic notation is used: $O(f(n))$ denotes an upper bound,
$\Omega(f(n))$ denotes a lower bound, and $\Theta(f(n))$ means that the bound is
tight (both upper and lower). This notation is used in most algorithms books [176].

Example 6.5.1 (Lower envelope of line segments) One interesting application of
Davenport-Schinzel sequences is to the lower envelope of a set of line
segments in $\mathbb{R}^2$. Because segments in general position can intersect in
at most one place, the number of edges in the lower envelope is
$\Theta(\lambda_3(n)) = \Theta(n \alpha(n))$. There are actually arrangements of
segments in $\mathbb{R}^2$ that reach this bound; see [687]. ■



6.5.3 Upper Bounds 

The upper bounds for motion planning problems arise from the existence of com- 
plete algorithms that solve them. This section proceeds by starting with the most 
general bounds, which are based on the methods of Section 6.4, and concludes 
with bounds for simpler motion planning problems. 

General algorithms The first upper bound for the general motion planning
problem of Formulation 4.3.1 came from the application of cylindrical algebraic
decomposition [676]. Let $n$ be the dimension of $\mathcal{C}$. Let $m$ be the number of
polynomials in $\mathcal{F}$, which are used to define $\mathcal{C}_{obs}$. Recall from
Section 4.3.3 how quickly this grows for simple examples. Let $d$ be the maximum
degree among the polynomials in $\mathcal{F}$. The maximum degree of the resulting
polynomials is bounded by $O(d^{2^n})$, and the total number of polynomials is
bounded by $O((md)^{3^n})$. The total running time required to use cylindrical
algebraic decomposition for motion planning is bounded by $(md)^{O(1)^n}$ (see
footnote 10). Note that the algorithm is doubly exponential in dimension, $n$,
but polynomial in $m$ and $d$. It can theoretically be declared to be efficient on a
space of motion planning problems of bounded dimension (although it certainly is
not efficient for motion planning in any practical sense).

Since the general problem is PSPACE-complete, it appears unavoidable that
a complete, general motion planning algorithm will require a running time that
is exponential in dimension. Since cylindrical algebraic decomposition is doubly
exponential, it led many in the 1980s to wonder whether this upper bound
can be lowered. This was achieved by Canny's roadmap algorithm, for which
the running time is bounded by $m^n (\lg m) d^{O(n^4)}$. Hence, it is singly
exponential, which appears very close to optimal because it is up against the
lower bound that seems to be implied by PSPACE-hardness (and the fact that
problems exist that require a roadmap with $(md)^n$ connected components [58]).
Much of the algorithm complexity is due to finding a suitable deterministic
perturbation to put the input polynomials into general position. A randomized
algorithm can alternatively be used, for which the randomized expected running
time is bounded by $m^n (\lg m) d^{O(n^2)}$. For a randomized algorithm [569], the
randomized expected running time is still a worst-case upper bound, but averaged
over random "coin tosses" that are introduced internally in the algorithm; it
does not reflect any kind of average over the expected input distribution. Thus,
these two bounds represent the best known upper bounds for the general motion
planning problem. Canny's algorithm may also be applied to solve the kinematic
closure problems of Section 4.4, but the complexity does not reflect the fact
that the dimension, $k$, of the algebraic variety is less than $n$, the dimension
of $\mathcal{C}$. A roadmap algorithm that is particularly suited for this problem is
introduced in [57], and its running time is bounded by $m^{k+1} d^{O(n^2)}$. This
serves as the best-known upper bound for the problems of Section 4.4.

10 It may seem odd for $O(\cdot)$ to appear in the middle of an expression. In this
context, it means that there exists some $c \in [0, \infty)$ such that the running
time is bounded by $(md)^{c^n}$.

Specialized algorithms Now upper bounds are summarized for some narrower 
problems, which are easier to solve than the general problem. All of the problems 
involve either two or three degrees of freedom. Therefore, we expect that the 
bounds are much lower than those for the general problem. In many cases, the 
Davenport-Schinzel sequences of Section 6.5.2 arise. Most of the bounds presented 
here are based on algorithms that are not practical to implement; they mainly 
serve to indicate the best asymptotic performance that can be obtained for a 
problem. Most of the bounds mentioned here are included in [686]. 

Consider the problem from Section 6.2, in which the robot translates in
$\mathcal{W} = \mathbb{R}^2$ and $\mathcal{C}_{obs}$ is polygonal. Suppose that $\mathcal{A}$ is a
convex polygon that has $k$ edges, and $\mathcal{O}$ is the union of $m$ convex polygons
with disjoint interiors, and their total number of edges is $n$. In this case, the
boundary of $\mathcal{C}_{free}$ (computed by Minkowski difference; see Section 4.3.2)
will have at most $6m - 12$ nonreflex vertices (interior angle less than $\pi$),
and $n + km$ reflex vertices (interior angle greater than $\pi$). The free space,
$\mathcal{C}_{free}$, can be decomposed and searched in time $O((n + km) \lg^2 n)$
[389, 686]. Using randomized algorithms, the bound reduces to
$O((n + km) \cdot 2^{\alpha(n)} \lg n)$ randomized expected time. Now suppose that
$\mathcal{A}$ is a single nonconvex polygonal region described by $k$ edges, and that
$\mathcal{O}$ is a similar polygonal region described by $n$ edges. The Minkowski
difference could yield $\Omega(k^2 n^2)$ edges for $\mathcal{C}_{obs}$. This can be
avoided if the search is performed within a single connected component of
$\mathcal{C}_{free}$. Based on analysis that uses Davenport-Schinzel sequences, it
can be shown that the worst connected component may have complexity
$\Theta(kn \alpha(k))$, and the planning problem can be solved in time
$O(kn \lg^2 n)$ deterministically, or for a randomized algorithm,
$O(kn \cdot 2^{\alpha(n)} \lg n)$ randomized expected time is needed. More
generally, if $\mathcal{C}_{obs}$ consists of $n$ algebraic curves in $\mathbb{R}^2$, each
with degree no more than $d$, then the motion planning problem for translation
only can be solved deterministically in time $O(\lambda_{s+2}(n) \lg^2 n)$, or with
a randomized algorithm in $O(\lambda_{s+2}(n) \lg n)$ randomized expected time. In
these expressions, $\lambda_{s+2}(n)$ is the bound (6.37) obtained from the
$(n, s + 2)$ Davenport-Schinzel sequence, and $s \le d^2$.

For the case of the line-segment robot of Section 6.3.4 in an obstacle region
described with $n$ edges, an $O(n^5)$-time algorithm was given. This is not the
best possible running time for solving the line-segment problem, but the method
is easier to understand than others that are more efficient. In [589], a roadmap
algorithm based on retraction is given that solves the problem in
$O(n^2 \lg n \lg^* n)$ time, in which $\lg^* n$ is the number of times that $\lg$
has to be iterated on $n$ to yield 1 (i.e., it is a very small, insignificant
term; for practical purposes, you can imagine that the running time is
$O(n^2 \lg n)$). The tightest known upper bound is $O(n^2 \lg n)$ [480]. It is
established in [388] that there exist examples for which the solution path
requires $\Omega(n^2)$ length to encode.
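
To see why the $\lg^* n$ factor is negligible, the iterated logarithm can be
computed directly. This is a small illustrative sketch; the name `lg_star` is
hypothetical, and the loop stops as soon as the value is at most 1, which covers
inputs that are not exact towers of 2.

```python
import math


def lg_star(n):
    """Number of times lg must be applied to n before the result is at most 1."""
    count = 0
    x = float(n)
    while x > 1.0:
        x = math.log2(x)
        count += 1
    return count
```

For example, `lg_star(65536)` is 4 (65536, 16, 4, 2, 1), and even for an input as
large as $10^{80}$ the value is only 5.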

Now consider the case for which $\mathcal{C} = SE(2)$, $\mathcal{A}$ is a convex polygon
with $k$ edges, and $\mathcal{O}$ is a polygonal region described by $n$ edges. The
boundary of $\mathcal{C}_{free}$ has no more than $O(kn \lambda_6(kn))$ edges, and can be
computed to solve the motion planning problem in $O(kn \lambda_6(kn) \lg kn)$ time
[5]. An algorithm that runs in time $O(k^4 n \lambda_3(n) \lg n)$ and provides
better clearance between the robot and obstacles is given in [151]. In [33] (some
details also appear in [437]), an algorithm is presented, and even implemented,
that solves the problem in time $O(k^3 n^3 \lg(kn))$, for the more general case in
which $\mathcal{A}$ is nonconvex. The number of faces of $\mathcal{C}_{obs}$ could be as
high as $\Omega(k^3 n^3)$ for this problem. By explicitly representing and
searching only one connected component, the best-known upper bound for the
problem is $O((kn)^{2+\epsilon})$, in which $\epsilon > 0$ may be chosen arbitrarily
small [310].

In the final case, suppose that $\mathcal{A}$ translates in $\mathcal{W} = \mathbb{R}^3$ to
yield $\mathcal{C} = \mathbb{R}^3$. For a polyhedron or polyhedral region, let its
complexity be the total number of faces, edges, and vertices. If $\mathcal{A}$ is a
polyhedron with complexity $k$, and $\mathcal{O}$ is a polyhedral region with
complexity $n$, then the boundary of $\mathcal{C}_{free}$ is a polyhedral surface of
complexity $\Theta(k^3 n^3)$. As for other problems, if the search is restricted to
a single component, then the complexity reduces. The motion planning problem
in this case can be solved in time $O((kn)^{2+\epsilon})$ [25]. If $\mathcal{A}$ is convex,
and there are $m$ convex obstacles, then the best-known bound is
$O(kmn \lg^2 m)$ time. More generally, if $\mathcal{C}_{obs}$ is bounded by $n$ algebraic
patches of constant maximum degree, then a vertical decomposition method can be
used to solve the motion planning problem within a single connected component
of $\mathcal{C}_{free}$ in time $O(n^{2+\epsilon})$.

Literature

A nice collection of early papers appears in [679]; this includes [589, 590, 651, 675, 
676, 677]. 

An excellent reference for material in combinatorial algorithms, computational
geometry, and complete algorithms for motion planning is the collection of survey
papers in [292].

If you need more algebra background, try reading [178] and [611] before trying 
to tackle books such as [58] and [351]. 

Say why we did not follow Latombe's naming of roadmap vs. cell decomp. 
Since all cell decomposition methods produce a roadmap, they can be considered 
as a special class of roadmap algs. 

Exercises 

1. Extend the vertical decomposition algorithm to correctly handle the case
in which $\mathcal{C}_{obs}$ has two or more points that lie on the same vertical line.
This includes the case of vertical segments. Random perturbations are not
allowed.

2. Describe in detail how to use the vertical decomposition and line-sweep idea
to compute $\mathcal{C}_{obs}$ in $O(n \lg n)$ time.

3. Propose a complete motion planning algorithm for a polygonal $\mathcal{C}_{obs}$
based on decomposing $\mathcal{C}_{obs}$ into triangles. What is the running time of
your algorithm?

4. Explain how to use the plane sweep idea to efficiently merge two nonconvex 
polygons. 

5. Extend vertical decomposition to work for circular arcs and line segments. 

6. Extend the bitangent graph algorithm to work for obstacle boundaries that 
are either pieces of circular arcs or line segments. 

7. Derive the Conchoid of Nicomedes equation for the segment robot. 

8. Make a resolution-complete version of the slicing/discretizing method for the
line segment robot.

9. Determine the cells for a line segment robot example. 

10. Make some 1D decompositions to determine truth for a Tarski sentence with
no free variables. Maybe a 2D example?

11. Construct a CAD for $\mathbb{S}^1$, $\mathbb{S}^2$, $\mathbb{S}^3$. (Give them cell
numbers for the first two.)

12. Show using the matrix (6.28) that Canny's roadmap for the torus, shown
in Figure 6.38, is correct. (need to give torus equation)

13. A semester project idea is to implement Canny's algorithm. (please tell me
if you succeed)



Chapter 7

Extensions of Basic Motion Planning

Chapter Status

What does this mean? Check http://msl.cs.uiuc.edu/planning/status.html for
information on the latest version.



This chapter presents many extensions and variations of the motion planning 
problem considered in Chapters 3 to 6. Each one of these can be considered 
as a "spin-off" that is fairly straightforward to describe using the mathematical 
concepts and algorithms introduced so far. Unlike previous chapters, there is not 
much continuity in Chapter 7. Each problem is treated independently; therefore, 
it is safe to jump to whatever sections in the chapter you find interesting without 
fear of missing important details. 

In many places throughout the chapter, a state space, $X$, will arise. This
is consistent with the general planning notation used throughout the book. In
Chapter 4, the configuration space, $\mathcal{C}$, was introduced, which can be
considered as a special state space: it encodes the set of transformations that
can be applied to a collection of bodies. Hence, Chapters 5 and 6 addressed
planning in $X = \mathcal{C}$. The configuration space alone will be insufficient for
many of the problems in this chapter; therefore, $X$ will be used because it
appears to be more general. For most cases in this chapter, however, $X$ is
derived from one or more configuration spaces. Thus, configuration space and
state space terminology will be used in combination.

7.1 Time-Varying Problems

This section brings time into the motion planning formulation. Although the robot
has been allowed to move, it has been assumed so far that the obstacle region,
$\mathcal{O}$, and the goal configuration, $q_g \in \mathcal{C}_{free}$, are stationary for
all time. It is now assumed that these entities may vary over time, although
their motions are predictable. If the motions are not predictable, then some form
of feedback is needed to respond to observations that are made during execution.
Such problems are much more difficult, and will be handled in Chapters 8, 10, and
elsewhere throughout the book. The current formulation is designed to allow the
tools and concepts learned so far to be directly applied to generate a path.

7.1.1 Problem Formulation 

Let $T \subset \mathbb{R}$ denote the time interval, which may be bounded or
unbounded. If $T$ is bounded, then $T = [0, t_f]$, in which $0$ is the initial time,
and $t_f$ is the final time. If $T$ is unbounded, then $T = [0, \infty)$. An initial
time other than $0$ could alternatively be defined without difficulty, but this
will not be done here.

Let the state space, $X$, be defined as $X = \mathcal{C} \times T$, in which $\mathcal{C}$ is
the usual configuration space of the robot, as defined in Chapter 4. A state, $x$,
can be represented as $x = (q, t)$, to indicate the configuration, $q$, and time,
$t$, components of the state vector. The planning will occur directly in $X$, and
in many ways it can be treated as any configuration space seen so far, but there
is one critical difference: time marches forward. Imagine a path that travels
through $X$. If it first reaches a state $(q_1, 5)$, and then later some state
$(q_2, 3)$, then traveling backwards through time is required! There is no
mathematical problem with allowing such time travel, but it is not realistic for
most applications. Therefore, paths in $X$ will be forced to follow a constraint
that they must move forward in time. Such a constraint can be considered
nonholonomic because it restricts the way the states can flow through $X$; this
notion will be formally considered in much greater generality in Chapter 14.

Now consider making time-varying versions of the items used in Formulation 
4.3.1 for motion planning: 

Formulation 7.1.1 (The Time-Varying Motion Planning Problem)

1. A world, $\mathcal{W}$, is defined, in which either $\mathcal{W} = \mathbb{R}^2$ or
$\mathcal{W} = \mathbb{R}^3$. This is the same as in Formulation 4.3.1.

2. A time interval, $T \subset \mathbb{R}$, is defined, which is either bounded to
yield $T = [0, t_f]$ for some final time, $t_f > 0$, or unbounded to yield
$T = [0, \infty)$.

3. A semi-algebraic, time-varying obstacle region $\mathcal{O}(t) \subset \mathcal{W}$ is
defined for every $t \in T$. It is assumed that the obstacle region is a finite
collection of rigid bodies that undergoes continuous, time-dependent rigid body
transformations.

4. The robot, $\mathcal{A}$ (or $\mathcal{A}_1, \ldots, \mathcal{A}_m$ for a linkage), and
configuration space, $\mathcal{C}$, definitions are the same as in Formulation 4.3.1.

5. The state space, $X$, is defined as the Cartesian product,
$X = \mathcal{C} \times T$, and a state, $x \in X$, may be denoted as $x = (q, t)$ to
denote the configuration, $q$, and time, $t$, components. See Figure 7.1. The
obstacle region, $X_{obs}$, in state space is defined as

    $X_{obs} = \{(q, t) \in X \mid \mathcal{A}(q) \cap \mathcal{O}(t) \neq \emptyset\}$, (7.1)

and $X_{free} = X \setminus X_{obs}$. For a given $t \in T$, slices of $X_{obs}$ and
$X_{free}$ are obtained. These are denoted as $\mathcal{C}_{obs}(t)$ and
$\mathcal{C}_{free}(t)$, respectively, in which (if $\mathcal{A}$ is one body)

    $\mathcal{C}_{obs}(t) = \{q \in \mathcal{C} \mid \mathcal{A}(q) \cap \mathcal{O}(t) \neq \emptyset\}$, (7.2)

and $\mathcal{C}_{free}(t) = \mathcal{C} \setminus \mathcal{C}_{obs}(t)$.

6. A state $x_i \in X_{free}$ is designated as the initial state, with the
constraint that $x_i = (q_i, 0)$ for some $q_i \in \mathcal{C}_{free}(0)$. In other
words, at the initial time the robot cannot be in collision.

7. A subset $X_g \subset X_{free}$ is designated as the goal region. A typical
definition is to pick some $q_g \in \mathcal{C}$ and let
$X_g = \{(q_g, t) \in X_{free} \mid t \in T\}$, which means that the goal is
stationary for all time.

8. A complete algorithm must compute a continuous, time-monotonic path,
$\tau : [0, 1] \to X_{free}$, such that $\tau(0) = x_i$ and $\tau(1) \in X_g$, or
correctly report that such a path does not exist. To be time monotonic, we
require that $t_1 < t_2$, which are obtained from $(q_1, t_1) = \tau(s_1)$ and
$(q_2, t_2) = \tau(s_2)$, for any $s_1, s_2 \in [0, 1]$ such that $s_1 < s_2$.
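
The time-monotonicity requirement in item 8 is easy to test on a sampled path.
The following Python sketch is illustrative only; the name `is_time_monotonic`
and the representation of a path as a list of `(q, t)` samples ordered by the path
parameter are assumptions. A sampled test of this kind can only check the samples
it is given, not the continuous path between them.

```python
def is_time_monotonic(states):
    """Check that time strictly increases along a sampled path.

    states: list of (q, t) pairs, ordered by increasing path parameter s.
    """
    times = [t for _, t in states]
    return all(t1 < t2 for t1, t2 in zip(times, times[1:]))


forward = [((0.0, 0.0), 0.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 2.5)]
backward = [((0.0, 0.0), 0.0), ((1.0, 0.0), 5.0), ((2.0, 0.0), 3.0)]
```

The list `backward` mirrors the time-travel situation described earlier, in which
a state at time 5 is followed by a state at time 3, so it is rejected.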

Example 7.1.1 Figure 7.1 shows an example for a convex, polygonal robot,
$\mathcal{A}$, that translates in $\mathcal{W} = \mathbb{R}^2$. There is a single, convex,
polygonal obstacle, $\mathcal{O}$. The two of these together yield a convex, polygonal
configuration space obstacle, $\mathcal{C}_{obs}(t)$, which is shown for times $t_1$,
$t_2$, and $t_3$. The obstacle moves with a piecewise-linear motion model, which
means that transformations applied to $\mathcal{O}$ are a piecewise-linear function
of time. For example, let $(x, y)$ be a fixed point on the obstacle. To be a
linear motion model, this point must transform as $(x + c_1 t, y + c_2 t)$ for
some constants $c_1, c_2 \in \mathbb{R}$. To be piecewise linear, it may change to a
different linear motion at a finite number of critical times. Between these
critical times, the motion must remain linear. There are two critical times in
the example. If $\mathcal{C}_{obs}(t)$ is polygonal, and a piecewise-linear motion
model is used, then $X_{obs}$ will be polyhedral, which is depicted in Figure 7.1.
A stationary goal is also shown, which appears as a line that is parallel to the
$T$ axis. ■

Figure 7.1: A time-varying example with linear obstacle motion. (The slices
$\mathcal{C}_{free}(t_1)$, $\mathcal{C}_{free}(t_2)$, and $\mathcal{C}_{free}(t_3)$ are shown.)
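
The piecewise-linear motion model of the example can be sketched as follows. This
is a hedged illustration, not the book's code: the function name
`piecewise_linear_position` and the encoding of the motion as a list of
`(start_time, (c1, c2))` pieces are assumptions made here.

```python
def piecewise_linear_position(p0, pieces, t):
    """Position at time t of an obstacle point that starts at p0 at time 0.

    pieces: list of (t_start, (c1, c2)), sorted by t_start, with the first
    t_start equal to 0. Each piece moves the point as (x + c1*dt, y + c2*dt)
    until the next critical time.
    """
    x, y = p0
    for k, (t_start, (c1, c2)) in enumerate(pieces):
        t_end = pieces[k + 1][0] if k + 1 < len(pieces) else float("inf")
        dt = min(t, t_end) - t_start  # time spent in this linear piece
        if dt <= 0:
            break
        x += c1 * dt
        y += c2 * dt
    return (x, y)


# Two critical times (t = 2 and t = 5), hence three linear pieces.
motion = [(0.0, (1.0, 0.0)), (2.0, (0.0, 1.0)), (5.0, (-1.0, 0.0))]
```

With this hypothetical motion, a point starting at the origin is at `(2.0, 1.0)`
at time 3 and at `(1.0, 3.0)` at time 6.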



In the general formulation, there are no additional constraints on $\tau$, which
means that the robot motion model allows infinite acceleration and unbounded
speed. The robot velocity may change instantaneously, but the path through
$\mathcal{C}$ must always be continuous. These issues did not arise in Chapter 4
because there was no need to mention time. Now it becomes necessary (see
footnote 1).



1 The infinite acceleration and unbounded speed assumptions may annoy those with mechanics
and control background. In this case, assume that the present models approximate the case in 
which every body moves slowly, and the dynamics can be consequently neglected. If this is 
still not satisfying, then jump ahead to Chapters 13 to 15, where general nonlinear systems 
are considered. It is still helpful to consider the implications derived from the concepts in this 
chapter because the issues remain for more complicated problems that involve dynamics. 






7.1.2 Direct Solutions 

Sampling-based methods Many sampling-based methods can be adapted from
$\mathcal{C}$ to $X$ without much difficulty. The time-dependency of obstacle models
must be taken into account when verifying that path segments are collision free;
the techniques from Section 5.3.4 can be extended to handle this. One important
concern is the metric for $X$. For some algorithms, it may be important to use
a pseudometric because symmetry is broken by time (going back in time is not as
easy as going forward).

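For example, suppose that the configuration space, $\mathcal{C}$, is a metric space,
$(\mathcal{C}, \rho)$. This metric can be extended across time to obtain a
pseudometric, $\rho_X$, as follows. For a pair of states, $x = (q, t)$ and
$x' = (q', t')$, let

    $\rho_X(x, x') = \begin{cases} 0 & \text{if } q = q' \\ \infty & \text{if } q \neq q' \text{ and } t' \le t \\ \rho(q, q') & \text{otherwise.} \end{cases}$ (7.3)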

Using $\rho_X$, several sampling-based methods will naturally work. For example,
the rapidly-exploring dense trees from Section 5.5 can be adapted to $X$. Using
$\rho_X$ for a single-tree approach will ensure that all path segments travel
forward in time. Using bidirectional approaches is more difficult for
time-varying problems, because $X_g$ is usually not a single point. It is not
clear which $(q, t)$ should be the starting vertex for the tree from the goal.
The sampling-based roadmap methods of Section 5.6 are perhaps the most
straightforward to adapt. The notion of a directed roadmap is needed, in which
every edge must be directed to yield a time-monotonic path. For each pair of
states, $(q, t)$ and $(q', t')$, such that $t \neq t'$, exactly one valid direction
exists for making a potential edge. If $t = t'$, then no edge can be attempted
because it would require the robot to instantaneously "teleport" from one part
of $\mathcal{W}$ to another. Because forward time progress is already taken into
account by the directed edges, a symmetric metric may be preferable instead
of (7.3) for the sampling-based roadmap approach.
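
The rule for orienting roadmap edges can be sketched in a few lines of Python.
The name `directed_edge` and the representation of a state as a `(q, t)` tuple
are assumptions for illustration; collision checking of the edge itself is
omitted.

```python
def directed_edge(x1, x2):
    """Orient a candidate roadmap edge between two states x = (q, t).

    Returns (source, target) with the earlier state first, or None when both
    states share the same time, in which case no edge may be attempted.
    """
    (_, t1), (_, t2) = x1, x2
    if t1 == t2:
        return None
    return (x1, x2) if t1 < t2 else (x2, x1)


a = ((0.0, 0.0), 1.0)
b = ((1.0, 0.5), 0.25)
edge = directed_edge(a, b)
```

Here `edge` runs from `b` to `a`, since `b` occurs earlier in time, and a pair of
states with equal times yields no edge at all.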

Combinatorial methods In some cases, combinatorial methods can be used
to solve time-varying problems. If the motion model is algebraic (i.e., expressed
with polynomials), then $X_{obs}$ will be semi-algebraic. This enables the
possibility of applying the general planners from Section 6.4, which are based on
computational real algebraic geometry. The key issue once again is that the
resulting roadmap must be directed with all edges being time monotonic. For
Canny's method, this requirement seems difficult to ensure. Cylindrical algebraic
decomposition is straightforward to adapt if $T$ is chosen as the last variable
to be considered in the sequence of projections. This will yield polynomials in
$\mathbb{Q}[t]$, and $\mathbb{R}$ will be nicely partitioned into time intervals and
time instances. Connections can then be made for a cell of one cylinder to an
adjacent cell of a cylinder that occurs later in time.

If $X_{obs}$ is polyhedral as depicted in Figure 7.1, then vertical decomposition
can be used. It is best to first sweep the plane along the $T$ axis, stopping at
the critical times when the linear motion changes. This will yield nice sections
which are further decomposed recursively, as explained in Section 6.3.3, and also
facilitates the connection of adjacent cells to obtain time-monotonic path
segments. It is not too difficult to imagine the approach working for a
four-dimensional state space, $X$, for which $\mathcal{C}_{obs}(t)$ is polyhedral as in
Section 6.3.3, and time adds the fourth dimension. Again, performing the first
sweep with respect to the $T$ axis is preferable.

Figure 7.2: Transitivity is broken if the cells are not formed in cylinders over
$T$. A time-monotonic path exists from $C_1$ to $C_2$, and from $C_2$ to $C_3$, but
this does not imply that one exists from $C_1$ to $C_3$.

If X is not decomposed into cylindrical slices over each noncritical time inter- 
val, then cell decompositions may still be used, but one has to be more concerned 
about correctly connecting cells. Figure 7.2 illustrates the problem, for which 
transitivity among adjacent cells is broken. This complicates sample point selec- 
tion for the cells. 

Bounded speed There has been no consideration so far of the speed at which
the robot must move to avoid obstacles. It is obviously impractical in many
applications if the solution requires the robot to move arbitrarily fast. One step
towards making a realistic model is to enforce a bound on the speed of the robot.
(More steps towards realism are taken in Chapter 13.) For simplicity, suppose
$\mathcal{C} = \mathbb{R}^2$, which corresponds to a translating rigid robot,
$\mathcal{A}$, that moves in $\mathcal{W} = \mathbb{R}^2$. A configuration,
$q \in \mathcal{C}$, can be represented as $q = (y, z)$ (since $x$ already refers to
a state vector). The robot velocity can be expressed as
$v = (\dot{y}, \dot{z}) \in \mathbb{R}^2$, in which $\dot{y} = dy/dt$ and
$\dot{z} = dz/dt$. The robot speed is $\|v\| = \sqrt{\dot{y}^2 + \dot{z}^2}$. A speed
bound, $b$, is a positive constant, $b \in (0, \infty)$, for which $\|v\| \le b$.

In terms of Figure 7.1, this means that the slope of a solution path $\tau$ must
be constrained. Suppose that the domain of $\tau$ is $T = [0, t_f]$ instead of
$[0, 1]$. This yields $\tau : T \to X$, and $\tau(t) = (y, z, t)$. Using this
representation, $d\tau_1/dt = \dot{y}$ and $d\tau_2/dt = \dot{z}$, in which $\tau_i$
denotes the $i$th component of $\tau$ (because it is a vector-valued function).
Thus, it can be seen that $b$ constrains the slope of $\tau(t)$ in $X$. To
visualize this, imagine that only motion in the $Y$ direction occurs and suppose
$b = 1$. If $\tau$ holds the robot fixed, then the speed is zero, which satisfies
any bound. If the robot moves at speed 1, then $d\tau_1/dt = 1$ and
$d\tau_2/dt = 0$, which satisfies the speed bound. In Figure 7.1 this generates a
path that has slope 1 in the $YT$ plane and is horizontal in the $ZT$ plane. If
$d\tau_1/dt = d\tau_2/dt = 1$, then the bound is exceeded because the speed is
$\sqrt{2}$. In general, the velocity vector at any state $(y, z, t)$ points into a
cone that starts at $(y, z)$ and is aligned in the positive $t$ direction; this is
depicted in Figure 7.3. At time $t + \Delta t$, the state must stay within the
cone, which means that

    $[y(t + \Delta t) - y(t)]^2 + [z(t + \Delta t) - z(t)]^2 \le b^2 (\Delta t)^2$. (7.4)

Figure 7.3: A projection of the cone constraint for the bounded speed problem.

This constraint makes it considerably more difficult to adapt the algorithms
of Chapters 5 and 6. Even for piecewise-linear motions of the obstacles, the
problem has been established to be PSPACE-hard [652, 653, 731], for
$\mathcal{W} = \mathbb{R}^2$. A complete algorithm that builds a kind of visibility
graph is presented in [653]. The sampling-based roadmap of Section 5.6 is perhaps
one of the easiest of the sampling-based algorithms to adapt for this problem.
The neighbors of a point $q$, which are determined for attempted connections,
must lie within the cone that represents the speed bound. If this constraint is
enforced, a dispersion-complete or probabilistically-complete planning algorithm
results.
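
The neighbor test implied by the cone constraint (7.4) can be written directly.
The following Python sketch is illustrative; the name `within_speed_cone` and the
encoding of a state as a `(y, z, t)` tuple are assumptions, and rejecting
candidates that do not advance in time is a choice made here to match the
time-monotonic requirement.

```python
def within_speed_cone(x_from, x_to, b):
    """Check whether x_to lies in the forward speed cone of x_from.

    States are (y, z, t) tuples and b is the speed bound. A candidate with a
    time that is not strictly later is rejected.
    """
    (y0, z0, t0), (y1, z1, t1) = x_from, x_to
    dt = t1 - t0
    if dt <= 0:
        return False
    # Squared form of (7.4): displacement^2 <= b^2 * dt^2.
    return (y1 - y0) ** 2 + (z1 - z0) ** 2 <= (b * dt) ** 2
```

With $b = 1$, moving one unit in $y$ over one unit of time is accepted, while
moving one unit in both $y$ and $z$ over the same time is rejected because the
required speed is $\sqrt{2}$.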

7.1.3 The Velocity Tuning Method 

An alternative to defining the problem in $\mathcal{C} \times T$ is to decouple it
into a path planning part and a motion timing part. Algorithms based on this
method cannot be complete, but velocity tuning is an important idea that can be
applied elsewhere. Suppose there are both stationary obstacles and moving
obstacles. For the stationary obstacles, suppose that some path
$\tau : [0, 1] \to \mathcal{C}_{free}$ has been computed using any of the techniques
in Chapters 5 and 6.

Figure 7.4: An illustration of path tuning: a) If the robot follows its computed
path, it may collide with the moving obstacle; b) the resulting state space.

The timing part is then handled in a second phase. This amounts to designing
a timing function, $\sigma : T \to [0, 1]$, that indicates for time $t$ the location
of the robot along the path, $\tau$. This is achieved by defining the composition
$\phi = \tau \circ \sigma$, which maps from $T$ to $\mathcal{C}_{free}$ via $[0, 1]$.
Thus, $\phi : T \to \mathcal{C}_{free}$. The configuration at time $t \in T$ may be
expressed as $\phi(t) = \tau(\sigma(t))$.

A two-dimensional state space can be defined as shown in Figure 7.4. The
purpose is to convert the design of $\sigma$ (and consequently $\phi$) into a
familiar planning problem. The robot must move along its path from $\tau(0)$ to
$\tau(1)$, and an obstacle $\mathcal{O}$ moves along its path over the time interval
$T$. Let $S = [0, 1]$ denote the domain of $\tau$. A state space, $X = T \times S$,
is shown, in which each point $(t, s)$ indicates the time $t \in T$ and the
position along the path, $s \in [0, 1]$. See Figure 7.4.b. An obstacle region
$X_{obs}$ is defined as

    $X_{obs} = \{(t, s) \in X \mid \mathcal{A}(\tau(s)) \cap \mathcal{O}(t) \neq \emptyset\}$. (7.5)

Once again, $X_{free}$ is defined as $X_{free} = X \setminus X_{obs}$. The task is to
find a continuous path $g : [0, 1] \to X_{free}$. If $g$ is time monotonic, then a
position $s \in S$ is assigned for every time, $t \in T$. These assignments can be
nicely organized into a function, $\sigma : T \to S$, from which $\phi$ is obtained
by $\phi = \tau \circ \sigma$ to determine where the robot will be at each time.
Being time monotonic in this context means that the path must always progress
from left to right in Figure 7.4.b. It can, however, be nonmonotonic in the $S$
direction. This corresponds to moving back and forth along $\tau$, causing some
configurations to be revisited.
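
The composition $\phi = \tau \circ \sigma$ can be illustrated with a short Python
sketch. Everything named here is hypothetical: `make_path` builds a
piecewise-linear $\tau$ from waypoints, `sigma` is one hand-written timing
function that pauses halfway along the path, and `compose` forms $\phi$.

```python
def make_path(waypoints):
    """Return tau: [0, 1] -> C, piecewise linear through at least two waypoints."""
    n = len(waypoints) - 1

    def tau(s):
        s = min(max(s, 0.0), 1.0)      # clamp to the domain S = [0, 1]
        k = min(int(s * n), n - 1)     # index of the active segment
        r = s * n - k                  # local parameter within the segment
        a, b = waypoints[k], waypoints[k + 1]
        return tuple(ai + r * (bi - ai) for ai, bi in zip(a, b))

    return tau


def compose(tau, sigma):
    """Return phi = tau o sigma, mapping time to configuration."""
    return lambda t: tau(sigma(t))


tau = make_path([(0.0, 0.0), (2.0, 0.0), (2.0, 2.0)])


def sigma(t):
    """Advance to s = 0.5, wait during 1 <= t < 2, then finish by t = 3."""
    if t < 1.0:
        return 0.5 * t
    if t < 2.0:
        return 0.5
    return min(0.5 + 0.5 * (t - 2.0), 1.0)


phi = compose(tau, sigma)
```

In this example the robot sits at the corner `(2.0, 0.0)` for the whole interval
from $t = 1$ to $t = 2$, which is the kind of waiting behavior that velocity
tuning can use to let a moving obstacle pass.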

Any of the methods described in Formulation 7.1.1 can be applied here. The 
dimension of X in this case is always two. Note that X Q b s is polygonal if the paths 
taken by A and O are both piecewise linear, and both are polygonal regions. In 
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Figure 7.5: Vertical decomposition can solve the path tuning problem. Note that 
this example is not in general position because vertical edges exist. The goal is 
to reach the green line at the top, which can be accomplished from any adjacent 
2-cell. For this example, it may even be accomplished from the first 2-cell if the 
robot is able to move quickly enough. 

this case, the vertical decomposition method of Section 6.2.2 can be applied by
sweeping along the time axis to yield a complete algorithm (complete after having
committed to $\tau$, but not complete for Formulation 7.1.1). The result is shown in
Figure 7.5. The cells are connected only if it is possible to reach one from the other
by traveling in the forward time direction. One sampling-based approach, which
may be suitable when $X_{obs}$ is not polygonal, is to place a grid
over $X$ and apply one of the classical search algorithms described in Section 5.4.2.
Once again, only path segments in $X$ that move forward in time are allowed.
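
The grid-based approach can be sketched as follows. This is only an illustrative sketch under stated assumptions, not an algorithm given in the text: $X = T \times S$ is assumed to be discretized into a boolean array `blocked[t][s]` (a hypothetical representation of $X_{obs}$), and each step advances time by exactly one grid row while the position along the path may decrease, stay fixed, or increase by one cell.

```python
from collections import deque


def velocity_tuning_bfs(blocked):
    """Breadth-first search over a grid on X = T x S.

    blocked[t][s] is True when grid cell (t, s) is assumed to lie in X_obs.
    Returns a list sigma with sigma[t] = path index at time step t, or None.
    """
    n_t, n_s = len(blocked), len(blocked[0])
    if blocked[0][0]:
        return None
    parent = {(0, 0): None}
    queue = deque([(0, 0)])
    while queue:
        t, s = queue.popleft()
        if s == n_s - 1:  # the end of the path has been reached
            sigma, node = [], (t, s)
            while node is not None:
                sigma.append(node[1])
                node = parent[node]
            return sigma[::-1]
        if t + 1 == n_t:
            continue
        # Time always moves forward; s may move backward, wait, or advance.
        for ds in (-1, 0, 1):
            nxt = (t + 1, s + ds)
            if 0 <= nxt[1] < n_s and not blocked[nxt[0]][nxt[1]] and nxt not in parent:
                parent[nxt] = (t, s)
                queue.append(nxt)
    return None


# Six time steps, four path positions; a moving obstacle covers s = 2 at t = 2, 3.
blocked = [[False] * 4 for _ in range(6)]
blocked[2][2] = blocked[3][2] = True
sigma = velocity_tuning_bfs(blocked)
```

Because the search expands states in order of increasing time, the returned list reaches the end of the path as early as this discretization permits. A finer grid, or validation of the segments between grid points, would be needed before trusting the result on a real problem.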



7.2 Multiple Robots

This section supposes that there are multiple robots that share the same world,
$\mathcal{W}$. A path must be computed for each one that avoids collisions with obstacles
and with other robots. In Chapter 4, each robot could be a rigid body, $\mathcal{A}$, or
be made of $k$ attached bodies, $\mathcal{A}_1, \ldots, \mathcal{A}_k$. To avoid confusion, superscripts will
be used in this section to denote different robots. The $i$th robot will be denoted
by $\mathcal{A}^i$. Suppose there are $m$ robots, $\mathcal{A}^1, \mathcal{A}^2, \ldots, \mathcal{A}^m$. Each robot, $\mathcal{A}^i$, has its
associated configuration space, $\mathcal{C}^i$, and its initial and goal configurations, $q^i_{init}$ and
$q^i_{goal}$.
7.2.1 Problem Formulation

A state space can be defined that considers the configurations of all of the robots
simultaneously,

$$X = \mathcal{C}^1 \times \mathcal{C}^2 \times \cdots \times \mathcal{C}^m. \tag{7.6}$$

A state $x \in X$ specifies all robot configurations, and may be expressed as $x =
(q^1, q^2, \ldots, q^m)$. Let $N$ denote the dimension of $X$, which is given by $N = \sum_{i=1}^{m} \dim(\mathcal{C}^i)$.

There are two sources of obstacle regions in the state space: 1) robot-obstacle
collisions, and 2) robot-robot collisions. For each $i$ such that $1 \le i \le m$, the
subset of $X$ that corresponds to robot $\mathcal{A}^i$ in collision with the obstacle region, $\mathcal{O}$,
is defined as

$$X^i_{obs} = \{x \in X \mid \mathcal{A}^i(q^i) \cap \mathcal{O} \neq \emptyset\}. \tag{7.7}$$

This models the robot-obstacle collisions.

For each pair, $\mathcal{A}^i$ and $\mathcal{A}^j$, of robots, the subset of $X$ that corresponds to $\mathcal{A}^i$
in collision with $\mathcal{A}^j$ is given by

$$X^{ij}_{obs} = \{x \in X \mid \mathcal{A}^i(q^i) \cap \mathcal{A}^j(q^j) \neq \emptyset\}. \tag{7.8}$$

Both (7.7) and (7.8) will be combined in (7.10) to yield $X_{obs}$.

Formulation 7.2.1 (Multiple-Robot Motion Planning)

1. The world, $\mathcal{W}$, and obstacle region, $\mathcal{O}$, are the same as in Formulation 4.3.1.

2. There are $m$ robots, $\mathcal{A}^1, \ldots, \mathcal{A}^m$, each of which may consist of one or more
moving bodies.

3. Each robot, $\mathcal{A}^i$, for $1 \le i \le m$, has an associated configuration space, $\mathcal{C}^i$.

4. The state space, $X$, is defined as the Cartesian product

$$X = \mathcal{C}^1 \times \mathcal{C}^2 \times \cdots \times \mathcal{C}^m. \tag{7.9}$$

The obstacle region in $X$ is

$$X_{obs} = \left( \bigcup_{i=1}^{m} X^i_{obs} \right) \cup \left( \bigcup_{ij,\, i \neq j} X^{ij}_{obs} \right), \tag{7.10}$$

in which $X^i_{obs}$ and $X^{ij}_{obs}$ are the robot-obstacle and robot-robot collision states
from (7.7) and (7.8), respectively.

5. A state $x_I \in X_{free}$ is designated as the initial state, in which $x_I = (q^1_I, \ldots, q^m_I)$.
For each $i$ such that $1 \le i \le m$, $q^i_I$ specifies the initial configuration of $\mathcal{A}^i$.

6. A state $x_G \in X_{free}$ is designated as the goal state, in which $x_G = (q^1_G, \ldots, q^m_G)$.

7. The task is to compute a continuous path, $\tau : [0, 1] \to X_{free}$, such that
$\tau(0) = x_I$ and $\tau(1) = x_G$.
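
To make the two sources of obstacle regions concrete, the following is a minimal sketch of a membership test for $X_{obs}$. It assumes, purely for illustration, that every robot is a translating disc in the plane and that $\mathcal{O}$ is a union of discs; the function name and data layout are hypothetical.

```python
import math
from itertools import combinations


def in_x_obs(x, radii, obstacles):
    """Return True if the state x lies in X_obs.

    x         : list of (px, py) centers, one configuration q^i per disc robot
    radii     : list of robot radii
    obstacles : list of (cx, cy, r) discs whose union is the obstacle region O
    """
    # Robot-obstacle collisions, as in (7.7): one test per robot.
    for (px, py), r in zip(x, radii):
        for cx, cy, r_obs in obstacles:
            if math.hypot(px - cx, py - cy) <= r + r_obs:
                return True
    # Robot-robot collisions, as in (7.8): one test per unordered pair.
    for i, j in combinations(range(len(x)), 2):
        if math.hypot(x[i][0] - x[j][0], x[i][1] - x[j][1]) <= radii[i] + radii[j]:
            return True
    return False


radii = [1.0, 1.0, 1.0]
obstacles = [(10.0, 0.0, 2.0)]
clear = in_x_obs([(0.0, 0.0), (3.0, 0.0), (0.0, 5.0)], radii, obstacles)
pair_hit = in_x_obs([(0.0, 0.0), (1.5, 0.0), (0.0, 5.0)], radii, obstacles)
```

Each pairwise test reads only two of the $m$ configurations, so moving the third robot anywhere cannot undo a collision between the first two.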
An ordinary motion planning problem? On the surface it may appear that
there is nothing unusual about the multiple-robot problem because the formulations
used in Chapter 4 already cover the case in which the robot consists of
multiple bodies. They do not have to be attached; therefore, $X$ can be considered
as an ordinary configuration space. The planning algorithms of Chapters 5 and 6
may be applied without adaptation. The main concern, however, is that the
dimension of $X$ grows linearly in the number of robots. For example, if there are 12
rigid bodies for which each has $\mathcal{C}^i = SE(3)$, then the dimension of $X$ is $6 \cdot 12 = 72$.
Complete algorithms require time that is at least exponential in dimension, which
makes them unlikely candidates for such problems. Sampling-based algorithms
are more likely to scale well in practice when there are many robots, but the resulting
dimension might still be too high.

Reasons to study multi-robot motion planning In spite of the fact that
multiple-robot motion planning can be handled like any other motion planning problem,
there are several reasons to study it separately:

1. The motions of the robots may be decoupled in many interesting ways.
This leads to several interesting methods that first develop some kind of
partial plan for the robots independently, and then consider the plan
interactions to produce a solution. This idea is referred to as decoupled planning.

2. The part of $X_{obs}$ due to robot-robot collisions has a cylindrical structure,
depicted in Figure 7.6, which can be exploited by planning algorithms to
make them more efficient. Each $X^{ij}_{obs}$ defined by (7.8) depends only on two
robots. A point, $x = (q^1, \ldots, q^m)$, is in $X_{obs}$ if there exist $i, j$ such that
$1 \le i, j \le m$, $i \neq j$, and $\mathcal{A}^i(q^i) \cap \mathcal{A}^j(q^j) \neq \emptyset$, regardless of the configurations
of the other $m - 2$ robots. For some decoupled methods, this even implies
that $X_{obs}$ can be completely characterized by 2D projections, as depicted in
Figure 7.10.

3. If optimality is important, then a unique set of issues arises for the case 
of multiple robots. It is not a standard optimization problem because the 
performance of each robot has to be optimized. There is no clear way to 
combine these objectives into a single optimization problem without los- 
ing some critical information. It will be explained in Section 7.7.2 that 
Pareto optimality naturally arises as the appropriate notion of optimality 
for multiple-robot motion planning. 

Assembly Planning One important variant of multiple-robot motion planning 
is called assembly planning [132, 309, 330, 336, 403, 775, 776]. In automated man- 
ufacturing, many complicated objects are assembled step-by-step from individual 
parts. It is convenient for robots to manipulate the parts one-by-one to insert 
them into the proper locations (see Section 7.3.2). Imagine a collection of parts, 
Figure 7.7: A collection of pieces used to define the assembly planning problem
in Figure 7.8. 



each of which is interpreted as a robot, as shown in Figure 7.7. The goal is to 
assemble the parts into one coherent object, such as that shown in Figure 7.8. The 
problem is generally approached by starting with the goal configuration, which is 
tightly constrained, and working outward. The problem formulation may allow 
that the parts touch, but their interiors cannot overlap. The assembly planning 
problem with arbitrarily many parts is NP-hard [] . Interesting special cases have 
been considered. In one such case, for which parts can be removed by a sequence 
of straight-line paths, a polynomial-time algorithm is given in [775, 776]. 



7.2.2 Decoupled Planning 

Decoupled approaches first design motions for the robots while ignoring robot- 
robot interactions. Once these interactions are considered, the choices available 
Figure 7.8: Assembly planning involves determining a sequence of motions that
assembles the parts. The object shown here is assembled from the parts in Figure 7.7.



to each robot are already constrained by the designed motions. If a problem arises, 
these approaches are typically unable to reverse their commitments. Therefore, 
completeness is lost. Nevertheless, these approaches are quite practical, and in 
some cases completeness can be recovered. 

Prioritized planning A straightforward approach to decoupled planning is to
sort the robots by priority, and plan for higher-priority robots first [233].
Lower-priority robots plan by viewing the higher-priority robots as moving obstacles.
Suppose the robots are sorted as $\mathcal{A}^1, \ldots, \mathcal{A}^m$, in which $\mathcal{A}^1$ has the highest priority.
The prioritized planning approach proceeds inductively as follows:

Base case: Use any motion planning algorithm from Chapters 5 and 6 to
compute a collision-free path, $\tau_1 : [0, 1] \to \mathcal{C}^1_{free}$, for $\mathcal{A}^1$. Compute a timing
function, $\sigma_1$, for $\tau_1$, to yield $\phi_1 = \tau_1 \circ \sigma_1 : T \to \mathcal{C}^1_{free}$.

Inductive case: Suppose that $\phi_1, \ldots, \phi_{i-1}$ have been designed for $\mathcal{A}^1, \ldots,
\mathcal{A}^{i-1}$, and that these functions avoid robot-robot collisions between
any of the first $i - 1$ robots. Formulate the first $i - 1$ robots as moving
obstacles in $\mathcal{W}$. For each $t \in T$ and $j \in \{1, \ldots, i - 1\}$, the configuration,
$q^j$, of each $\mathcal{A}^j$ is $\phi_j(t) = \tau_j(\sigma_j(t))$. This yields $\mathcal{A}^j(\phi_j(t)) \subset \mathcal{W}$, which can be
considered as a subset of the obstacle $\mathcal{O}(t)$. Design a path, $\tau_i$, and timing
function, $\sigma_i$, using any of the time-varying motion planning methods from
Section 7.1, to yield $\phi_i = \tau_i \circ \sigma_i$.

A special case of prioritized planning would be to design all of the paths, $\tau_1$, $\tau_2$,
$\ldots$, $\tau_m$, in the first phase, then formulate each inductive step as a velocity tuning
problem. This yields a sequence of 2D planning problems, which can be solved 
quite easily. This will come at a greater expense, however, because the choices 
are even more constrained. The idea of preplanning paths, and even roadmaps, 
for all robots independently can lead to a powerful solution if the coordination of 
the robots is approached more carefully. This is the next topic. 
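
The inductive scheme can be sketched on a small grid world. This is a rough illustration under stated assumptions rather than a method prescribed here: space and time are discretized, each robot occupies one cell, a robot may wait or step to a 4-neighbor, and a breadth-first search over (cell, time) pairs stands in for the time-varying planners of Section 7.1. All names are hypothetical.

```python
from collections import deque

MOVES = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]  # wait, or step to a 4-neighbor


def position_at(path, t):
    """A robot stays at the end of its timed path once it has arrived."""
    return path[min(t, len(path) - 1)]


def plan_one(free, start, goal, higher, horizon):
    """Search over (cell, time); the timed paths in `higher` are moving obstacles."""
    settle = max((len(p) for p in higher), default=1) - 1  # all are parked afterwards
    parent = {(start, 0): None}
    queue = deque([(start, 0)])
    while queue:
        cell, t = queue.popleft()
        goal_stays_clear = all(position_at(p, u) != goal
                               for p in higher for u in range(t, settle + 1))
        if cell == goal and goal_stays_clear:
            path, node = [], (cell, t)
            while node is not None:
                path.append(node[0])
                node = parent[node]
            return path[::-1]
        if t == horizon:
            continue
        for dx, dy in MOVES:
            nxt = (cell[0] + dx, cell[1] + dy)
            if nxt not in free or (nxt, t + 1) in parent:
                continue
            # Reject sharing a cell, and reject swapping cells with another robot.
            if any(position_at(p, t + 1) == nxt or
                   (position_at(p, t) == nxt and position_at(p, t + 1) == cell)
                   for p in higher):
                continue
            parent[(nxt, t + 1)] = (cell, t)
            queue.append((nxt, t + 1))
    return None


def prioritized_planning(free, starts, goals, horizon=50):
    """Plan robots in priority order; earlier commitments are never revised."""
    paths = []
    for start, goal in zip(starts, goals):
        path = plan_one(free, start, goal, paths, horizon)
        if path is None:
            return None
        paths.append(path)
    return paths


corridor = {(x, 0) for x in range(5)}
starts, goals = [(0, 0), (4, 0)], [(4, 0), (0, 0)]
paths_ok = prioritized_planning(corridor | {(3, 1)}, starts, goals)
paths_fail = prioritized_planning(corridor | {(2, 1)}, starts, goals)
```

In the second call the side pocket is too far from the lower-priority robot to be reached before the first robot passes, so the planner returns `None` even though a coordinated solution exists (the first robot could wait). This is the loss of completeness caused by refusing to reverse commitments.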
Figure 7.9: If $\mathcal{A}^1$ neglects the query for $\mathcal{A}^2$, then completeness is lost when using
the prioritized planning approach. This example has a solution in general, but
prioritized planning fails to find it.



Fixed-path coordination Suppose that each robot, $\mathcal{A}^i$, is constrained to follow
a path $\tau_i : [0, 1] \to \mathcal{C}^i_{free}$, which can be computed using any ordinary motion planning
technique. For $m$ robots, an $m$-dimensional state space called a coordination
space can be formed, which schedules the motions of the robots along their paths
so that they will not collide [587]. One interesting feature to note carefully is that
time will only be implicitly represented in the coordination space. The task will
be to compute a path in the coordination space, from which explicit timings can
be easily extracted.

For $m$ robots, the coordination space, $X$, is defined as the $m$-dimensional unit
cube $X = [0, 1]^m$. Figure 7.10 depicts an example for which $m = 3$. The $i$th
coordinate of $X$ represents the domain, $S_i = [0, 1]$, of the path $\tau_i$. A state, $x \in X$,
therefore indicates the configuration of every robot. For each $i$, the configuration
$q^i \in \mathcal{C}^i$ is given by $q^i = \tau_i(x_i)$. At state $(0, \ldots, 0) \in X$, every robot is in its
initial configuration, $q^i_{init} = \tau_i(0)$, and at state $(1, \ldots, 1) \in X$, every robot is in
its goal configuration, $q^i_{goal} = \tau_i(1)$. Any continuous path, $\sigma : [0, 1] \to X$, for
which $\sigma(0) = (0, \ldots, 0)$ and $\sigma(1) = (1, \ldots, 1)$, will move the robots to their goal
configurations. The path $\sigma$ does not even need to be monotonic, in contrast to
prioritized planning.

One important concern has been neglected so far. What prevents us from
designing $\sigma$ as a straight-line path between the opposite corners of $[0, 1]^m$? We
have not yet taken into account the collisions between the robots. This forms
an obstacle region, $X_{obs}$, that must be avoided when designing a path through $X$.
Thus, the task is to design $\sigma : [0, 1] \to X_{free}$, in which $X_{free} = X \setminus X_{obs}$.

The definition of $X_{obs}$ is very similar to (7.8) and (7.10), except that here the
state space dimension is much smaller. Each $q^i$ is replaced by a single parameter.
The cylindrical structure, however, is still retained, as shown in Figure 7.10. Each
Figure 7.10: The obstacles that arise from coordinating $m$ robots are always
cylindrical. The set of all 2D projections completely characterizes $X_{obs}$.

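cylinder of $X_{obs}$ is given by

$$X^{ij}_{obs} = \{(s_1, \ldots, s_m) \in X \mid \mathcal{A}^i(\tau_i(s_i)) \cap \mathcal{A}^j(\tau_j(s_j)) \neq \emptyset\}, \tag{7.11}$$

which are combined to yield

$$X_{obs} = \bigcup_{ij,\, i \neq j} X^{ij}_{obs}. \tag{7.12}$$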

Standard motion planning algorithms can be applied to the coordination space
because there is no monotonicity requirement on $\sigma$. If 1) $\mathcal{W} = \mathbb{R}^2$, 2) $m = 2$ (two
robots), 3) the obstacles and robots are polygonal, and 4) the paths, $\tau_i$, are
piecewise linear, then $X_{obs}$ will be a polygonal region in $X$. This enables the methods
of Section 6.2, for a polygonal $\mathcal{C}_{obs}$, to directly apply after the representation of
$X_{obs}$ is explicitly constructed. For $m > 2$, the multidimensional version of vertical
cell decomposition, given for $m = 3$ in Section 6.3.3, can be applied. For general
coordination problems, cylindrical algebraic decomposition or Canny's algorithm
can be applied. For the problem of robots in $\mathcal{W} = \mathbb{R}^2$ that either translate or
move along circular paths, a resolution-complete planning method based on exact
determination of $X_{obs}$ using special collision detection methods is given in [706].

For very challenging coordination problems, sampling-based methods may
yield practical solutions. Perhaps one of the simplest is to place a grid
over $X$ and adapt the classical search algorithms, as described in Section 5.4.2
[461, 587]. Other possibilities include using the RDTs of Section 5.5, or if the
multiple-query framework is appropriate, then the sampling-based roadmap methods of
Section 5.6 may be suitable. Methods for validating the path segments, which were covered in
Section 5.3.4, can be adapted without trouble to the case of coordination spaces.
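
A grid search over the coordination space can be sketched for $m = 2$. The setup below is an assumption made for illustration: each fixed path is given as a list of sampled points, both robots are discs of the same radius, and only the grid points are checked for collision (segments between grid points are not validated, so a practical version would add the checks of Section 5.3.4).

```python
import math
from collections import deque


def coordinate_two_robots(path1, path2, radius):
    """Search the coordination grid; cell (i, j) places robot 1 at path1[i]
    and robot 2 at path2[j]. Returns a list of grid cells, or None."""
    def is_free(i, j):
        return math.dist(path1[i], path2[j]) > 2 * radius  # the discs do not touch

    start, goal = (0, 0), (len(path1) - 1, len(path2) - 1)
    if not (is_free(*start) and is_free(*goal)):
        return None
    # No monotonicity requirement: either robot may advance, wait, or back up.
    steps = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1) if (di, dj) != (0, 0)]
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            schedule = []
            while node is not None:
                schedule.append(node)
                node = parent[node]
            return schedule[::-1]
        for di, dj in steps:
            nxt = (node[0] + di, node[1] + dj)
            if (0 <= nxt[0] < len(path1) and 0 <= nxt[1] < len(path2)
                    and nxt not in parent and is_free(*nxt)):
                parent[nxt] = node
                queue.append(nxt)
    return None


# Two straight paths that cross at the origin.
path1 = [(-2 + 0.5 * k, 0.0) for k in range(9)]
path2 = [(0.0, -2 + 0.5 * k) for k in range(9)]
schedule = coordinate_two_robots(path1, path2, radius=0.4)
```

The straight diagonal of the grid would bring both robots to the crossing point at once, so the returned schedule has to pass around the blocked cells, which corresponds to letting one robot cross before the other.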

Thus far, the particular speeds of the robots have been neglected. For
explanation purposes, consider the case of $m = 2$. Moving vertically or horizontally in
$X$ holds one robot fixed while the other moves at some maximum speed. Moving
diagonally in $X$ moves both robots, and their relative speeds depend on the slope
of the path. To carefully regulate these speeds, it may be necessary to
reparameterize the paths by distance. In this case each axis of $X$ represents the distance
traveled, instead of $[0, 1]$.



Fixed-roadmap coordination The fixed-path coordination approach still can- 
not solve the problem in Figure 7.9 if the paths are designed independently. For- 
tunately, fixed-path coordination can be extended to enable each robot to move 
over a roadmap or other topological graph. This still yields a coordination space
that has only one dimension per robot, and the resulting planning methods are 
much closer to being complete, assuming each robot utilizes a roadmap that has 
many alternative paths. There is also motivation to study this problem by itself 
because of autonomous guided vehicles (AGVs), which often move in factories on
a network of predetermined paths []. In this case, coordinating the robots is the 
planning problem, as opposed to being a simplification of Formulation 7.2.1. 

One way to obtain completeness for Formulation 7.2.1 is to design the
independent roadmaps so that each robot has its own garage configuration. The conditions
for a configuration, $q^i$, to be a garage for $\mathcal{A}^i$ are: 1) while at configuration $q^i$, it is
impossible for any other robots to collide with it (i.e., in all coordination states
for which the $i$th coordinate is $q^i$, no collision occurs); 2) $q^i$ is always reachable
by $\mathcal{A}^i$ from $q^i_{init}$, and its presence at $q^i$ does not block other robots from reaching
their garages. If each robot has a roadmap and a garage, and if the planning
method for $X$ is complete, then the overall planning algorithm is complete. If the
planning method in $X$ uses some weaker notion of completeness, then this is also
maintained. For example, a resolution-complete sampling-based planner for $X$
will yield a resolution-complete approach to the problem in Formulation 7.2.1
(or to the problem of planning for an AGV).
Figure 7.11: An example in which $\mathcal{A}^1$ moves along three paths, and $\mathcal{A}^2$ moves
along one.

Cube complex How is the coordination space represented when there are
multiple paths for each robot? It turns out that a cube complex is obtained, which is
a special kind of singular complex (recall from Section 6.3.1). The coordination
space for fixed paths can be considered as a singular $m$-simplex. For example, the
problem in Figure 7.10 can be considered as a singular 3-simplex, $[0, 1]^3 \to X$. In
Section 6.3.1 the domain of a $k$-simplex was defined using $B^k$, a $k$-dimensional ball;
however, a cube, $[0, 1]^k$, will also work because $B^k$ and $[0, 1]^k$ are homeomorphic.

For a topological space, $X$, let a $k$-cube (which is also a singular $k$-simplex), $\square_k$, be
a continuous mapping $\sigma : [0, 1]^k \to X$. A cube complex is obtained by connecting
together $k$-cubes of different dimensions. Every $k$-cube for $k \ge 1$ has $2k$ faces,
which are $(k - 1)$-cubes that are obtained as follows. Let $(s_1, \ldots, s_k)$ denote a
point in $[0, 1]^k$. For each $i \in \{1, \ldots, k\}$, one face is obtained by setting $s_i = 0$,
and another is obtained by setting $s_i = 1$.

The cubes must fit together nicely, much in the same way that the simplexes
of a simplicial complex were required to fit together. To be a cube complex, $\mathcal{K}$ must
be a collection of simplexes that satisfies these requirements:

1. Any face, $\square_{k-1}$, of a cube $\square_k \in \mathcal{K}$ is also in $\mathcal{K}$.

2. The intersection of the images of any two $k$-cubes $\square_k, \square'_k \in \mathcal{K}$ is either
empty, or there exists some cube, $\square_i \in \mathcal{K}$ for $i < k$, which is a common face
of both $\square_k$ and $\square'_k$.

Let $G_i$ denote a topological graph (which may also be a roadmap) for robot
$\mathcal{A}^i$. The graph can be designed by constructing paths of the form $\tau : [0, 1] \to
\mathcal{C}^i_{free}$. Before covering formal definitions of the resulting complex, consider Figure
7.11, in which $\mathcal{A}^1$ moves along three paths connected in a "T" junction, and
$\mathcal{A}^2$ moves along one path. In this case, there are three two-dimensional
fixed-path coordination spaces, which are attached together along one common edge, as
shown in Figure 7.12. The resulting cube complex is defined by three 2-cubes (i.e.,
squares), ten 1-cubes (i.e., line segments, one of which is the common edge), and
eight 0-cubes (i.e., corner points).
Figure 7.12: The coordination space that corresponds to the example in Figure
7.11. 



Now suppose more generally that there are two robots, $\mathcal{A}^1$ and $\mathcal{A}^2$, with
associated topological graphs, $G_1(V_1, E_1)$ and $G_2(V_2, E_2)$, respectively. Suppose that
$G_1$ and $G_2$ have $n_1$ and $n_2$ edges, respectively. A two-dimensional cube complex,
$\mathcal{K}$, is obtained as follows. Let $\tau_i$ denote the $i$th path of $G_1$, and let $\tau_j$ denote the
$j$th path of $G_2$. A 2-cube (square) exists in $\mathcal{K}$ for every way to select an edge
from each graph. Thus, there are $n_1 n_2$ 2-cubes, one for each pair $(\tau_1, \tau_2)$ such that
$\tau_1 \in E_1$ and $\tau_2 \in E_2$. The 1-cubes are generated for pairs of the form $(v_1, e_2)$ for
$v_1 \in V_1$ and $e_2 \in E_2$, or $(e_1, v_2)$ for $e_1 \in E_1$ and $v_2 \in V_2$. The 0-cubes (corner
points) are reached for each pair $(v_1, v_2)$ such that $v_1 \in V_1$ and $v_2 \in V_2$.

If there are m robots, then an m-dimensional cube complex arises. Every 
m-cube corresponds to a unique combination of paths, one for each robot. The 
(m — l)-cubes are the faces of the m-cubes. This continues iteratively until the 
0-cubes are reached. 
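
The pairing rule can be checked by counting. The sketch below assumes that every $k$-cube of the complex arises by choosing an edge for $k$ of the robots and a vertex for each remaining robot, which generalizes the two-robot description; the function name is hypothetical.

```python
from itertools import product


def count_cubes(n_vertices, n_edges):
    """Return counts[k] = number of k-cubes in the coordination cube complex.

    n_vertices[i] and n_edges[i] are the numbers of vertices and edges of the
    topological graph G_i of the i-th robot.
    """
    m = len(n_vertices)
    counts = [0] * (m + 1)
    # choice[i] = 1 selects an edge of G_i, choice[i] = 0 selects a vertex.
    for choice in product((0, 1), repeat=m):
        ways = 1
        for i, use_edge in enumerate(choice):
            ways *= n_edges[i] if use_edge else n_vertices[i]
        counts[sum(choice)] += ways
    return counts


# "T" junction for the first robot (4 vertices, 3 edges), one path for the
# second robot (2 vertices, 1 edge), as in Figure 7.11.
t_junction_counts = count_cubes([4, 2], [3, 1])
```

For the "T" junction example this gives 8 corner points, 10 line segments, and 3 squares. Each square contributes four sides, and the single edge shared by all three squares is counted once.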



Planning on the cube complex Once again, any of the planning methods of 
Chapters 5 and 6 can be adapted, but the methods are slightly complicated by 
the fact that X is a complex. To use sampling-based methods, a dense sequence 
should be generated over $X$. For example, if random sampling is used, then an
m-cube can be chosen at random, followed by a random point in the cube. The 
local planning method (LPM) must take into account the connectivity of the 
cube complex, which requires recognizing when branches occur in the topological 
graph. Combinatorial methods must also take into account this connectivity. For 
example, a sweeping technique can be applied to produce a vertical decomposition, 
but the sweep-line (or sweep-plane) must sweep across the various m-cells of the 
complex. 
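
Random sampling over the cube complex can be sketched in a few lines. The representation is an assumption: each robot's graph is given only as a list of edge labels, and a sample is one (edge, parameter) pair per robot, so that the chosen edges identify the $m$-cube and the parameters identify the point inside it.

```python
import random


def sample_cube_complex(edge_lists, rng=random):
    """Pick an m-cube at random (one edge per robot), then a random point in it.

    edge_lists[i] is a list of labels for the edges (paths) of G_i.
    Returns a list of (edge_label, s) pairs with s in [0, 1).
    """
    return [(rng.choice(edges), rng.random()) for edges in edge_lists]


rng = random.Random(0)  # seeded only to make the illustration repeatable
sample = sample_cube_complex([["a", "b", "c"], ["d"]], rng)
```

This treats every $m$-cube as equally likely regardless of path lengths. If the edges differ greatly in length, weighting the choice by length would be a natural refinement, which the sketch does not attempt.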
Figure 7.13: A hybrid state space can be imagined as having layers of configuration
spaces which are indexed by modes. (Figure labels: Modes, Layers.)

7.3 Hybrid Systems: Discrete and Continuous 

Many important applications involve a mixture of discrete and continuous vari- 
ables. This results in a state space that is a Cartesian product of a finite set 
called the mode space, and a continuous set called the configuration space. The 
resulting hybrid system can be visualized as having layers of configuration spaces
that are indexed by the modes, as depicted in Figure 7.13. The main application 
given in this section is manipulation planning; many others exist, especially when 
other complications such as dynamics and uncertainties are added to the problem. 
The framework of this section is inspired mainly by hybrid systems in the control
theory community [93], which usually model mode-dependent dynamics.
The main concern in this section is that the allowable robot motions and/or the 
obstacles depend on the mode. 

7.3.1 General Framework 

As illustrated in Figure 7.13, a hybrid system involves interaction between discrete 
and continuous spaces. The formal model will first be given, followed by some
explanation. This formulation can be considered as a synthesis of the components 
from discrete feasible planning, Formulation 2.2.1, and basic motion planning, 
Formulation 4.3.1. 

Formulation 7.3.1 (Hybrid System Motion Planning) 

1. The W and C components from Formulation 4.3.1 are included. 

2. A nonempty mode space, M, is defined which is a finite or countably infinite 
set of modes. 



3. A semi-algebraic obstacle region, $\mathcal{O}(m)$, is defined for each mode $m \in M$.

4. A semi-algebraic robot, $\mathcal{A}(m)$, is defined for each $m \in M$. It may be a rigid
robot or a collection of links. It will be assumed here that the configuration
space is not mode-dependent; only the geometry of the robot can depend
on the mode. When the robot is transformed to configuration $q$, it will be
denoted as $\mathcal{A}(q, m)$.

5. A state space, $X$, is defined as the Cartesian product $X = \mathcal{C} \times M$. A state
may be represented as $x = (q, m)$, in which $q \in \mathcal{C}$ and $m \in M$. Let

$$X_{obs} = \{(q, m) \in X \mid \mathcal{A}(q, m) \cap \mathcal{O}(m) \neq \emptyset\}, \tag{7.13}$$

and $X_{free} = X \setminus X_{obs}$.

6. For each state, $x \in X$, there is a finite action space, $U(x)$. Let $U$ denote the set of
all possible actions (the union of $U(x)$ over all $x \in X$).

7. A mode transition function, $f_m$, which produces a mode, $f_m(x, u) \in M$, for
every $x \in X$ and $u \in U(x)$. It is assumed that $f_m$ is defined in a way that does not
produce race conditions. This means that if $q$ is fixed, the mode can change
at most once. It then remains constant, and may only change if $q$ is changed.

8. A state transition function, $f$, which is derived from $f_m$ by changing the
mode and holding the configuration fixed. Thus, $f((q, m), u) = (q, f_m((q, m), u))$.

9. A state $x_I \in X_{free}$ is designated as the initial state.

10. A set $X_G \subseteq X_{free}$ is designated as the goal region. A region is
defined instead of a point to facilitate the specification of a goal configuration
that does not depend on the final mode.

11. An algorithm must compute a (continuous) path, $\tau : [0, 1] \to X_{free}$, and
action function, $\sigma : [0, 1] \to U$, such that $\tau(0) = x_I$ and $\tau(1) \in X_G$, or
correctly report that such a combination of path and action function does not
exist.

The obstacle region and robot may or may not be mode-dependent, depending on
the problem. Examples of each will be given shortly. Changes in the mode depend
on the action taken by the robot. From most states, it is usually assumed that
a "do nothing" action exists, which leaves the mode unchanged. From certain
states, the robot may select an action that changes the mode as desired. An
interesting degenerate case exists, in which there is only a single action available.
This means that the robot has no control over the mode from that state. If the
robot arrives in such states, a mode change could automatically occur.

The solution requirement is somewhat more complicated because both a path
and action function need to be specified. It is insufficient to specify a path because
it is important to know what action was applied to induce the correct mode
transitions. Therefore, $\sigma$ is used to indicate when these occur. Note that $\tau$ and
$\sigma$ are closely coupled; one cannot simply associate any $\sigma$ with a path $\tau$; it must
correspond to the actions required to generate $\tau$.

Example 7.3.1 (The Power of the Portiernia) In this example, a robot, $\mathcal{A}$,
is modeled as a square that translates in $\mathcal{W} = \mathbb{R}^2$. Therefore, $\mathcal{C} = \mathbb{R}^2$. The
obstacle region in $\mathcal{W}$ is mode-dependent because of two doors, which are numbered
"1" and "2" in Figure 7.14.a. In the upper left sits the portiernia,² which is able
to give a key to the robot, if the robot is in a configuration as shown in Figure
7.14.b. The portiernia only trusts the robot with one key at a time, which may
be either for Door 1 or Door 2. The robot can return a key by revisiting the
portiernia. As shown in Figures 7.14.c and 7.14.d, the robot can open a door by
making contact with it, as long as it holds the correct key.

The set, M, of modes needs to encode which key, if any, the robot holds, and 
also it must encode the status of the doors. The robot may either have: 1) the key 
to Door 1; 2) the key to Door 2; or 3) no keys. The doors may have the status: 1) 
both open; 2) Door 1 open, Door 2 closed; 3) Door 1 closed, Door 2 open; or 4) 
both closed. Considering keys and doors in combination yields 12 possible modes. 

If the robot is at a portiernia configuration as shown in Figure 7.14.b, then its 
available actions correspond to different ways to pick up and drop off keys. For 
example, if the robot is holding the key to Door 1, it can drop it off and pick 
up the key to Door 2. This changes the mode, but the door status and robot 
configuration must remain unchanged when $f_m$ and $f$ are applied. The other
locations in which the robot may change the mode are when in contact with Door
1 or Door 2. The mode changes only if the robot is holding the proper
key. In all other configurations, the robot only has a single action (i.e., no choice), 
which keeps the mode fixed. 

The task is to reach the configuration shown in the lower right with dashed 
lines. The problem is solved by: 1) picking up the key for Door 1 at the portiernia; 
2) opening Door 1; 3) swapping the key at the portiernia to obtain the key for 
Door 2; 4) entering the innermost room to reach the goal configuration. As a 
final condition, we might want to require that the robot returns the key to the 
portiernia. 
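
The mode space and mode transition function of this example can be sketched as follows. The encoding is an assumption made for illustration: the configuration is abstracted to a location label (`"portiernia"`, `"door1"`, `"door2"`, or `"elsewhere"`), and the action names are hypothetical.

```python
from itertools import product

KEYS = ("none", "key1", "key2")
DOORS = tuple(product((False, True), repeat=2))  # (door 1 open?, door 2 open?)
MODES = tuple(product(KEYS, DOORS))              # every (key, door status) pair


def mode_transition(location, mode, action):
    """A sketch of f_m: returns the next mode; the configuration is unchanged."""
    key, (door1_open, door2_open) = mode
    if location == "portiernia":
        swaps = {"take_key1": "key1", "take_key2": "key2", "return_key": "none"}
        if action in swaps:  # one key at a time: the new key replaces the old one
            return (swaps[action], (door1_open, door2_open))
    if location == "door1" and action == "open" and key == "key1":
        return (key, (True, door2_open))
    if location == "door2" and action == "open" and key == "key2":
        return (key, (door1_open, True))
    return mode  # the "do nothing" outcome in every other case


# The solution sequence described in the example.
mode = ("none", (False, False))
mode = mode_transition("portiernia", mode, "take_key1")
mode = mode_transition("door1", mode, "open")
mode = mode_transition("portiernia", mode, "take_key2")
mode = mode_transition("door2", mode, "open")
final_mode = mode
```

Enumerating `MODES` reproduces the count of 12 modes, and the sequence above ends with both doors open while the robot holds the key to Door 2. Returning the key afterwards would be one more transition at the portiernia.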

Example 7.3.1 allows the robot to change the obstacles in O. The next example 
involves a robot that can change its shape. This is an illustrative example of a 
reconfigurable robot. The study of such robots has become a popular topic of 
research [154, 277, 410, 789]; the reconfiguration possibilities in that research area 
are much more sophisticated than the simple example considered here. 

Example 7.3.2 (Reconfigurable Robot) To solve the problem shown in Fig- 
ure 7.15, the robot must change its shape. There are two possible shapes, which 
correspond directly to the modes: elongated and compressed. Examples of each 
are shown in the figure. Figure 7.16 shows how $\mathcal{C}_{free}(m)$ appears for each of the



2 These are groups of people who guard the keys at some public facilities in Poland. 




Figure 7.14: In the red area (the portiernia) pictured in the upper left, the robot 
can pick up and drop off keys that open one of two doors. If the robot contacts a 
door while holding the correct key, then it opens. 
Figure 7.15: An example in which the robot must reconfigure itself to solve the
problem. There are two modes: elongated and compressed. 



two modes. Suppose the robot starts initially from the left while in the elongated 
mode, and must travel to the last room on the right. This problem must be solved 
by: 1) reconfiguring the robot into the compressed mode; 2) passing through the 
corridor into the center; 3) reconfiguring the robot into the elongated mode; 4) 
passing through the corridor to the rightmost room. The robot has actions that 
directly change the mode by reconfiguring itself. To make the problem more
interesting, we could require that the robot is only able to reconfigure itself in specific
locations (e.g., where there is enough clearance, or possibly at a location where 
another robot can assist it). 

The examples presented so far barely scratch the surface on the possible hybrid 
problems that can be defined. Many such problems can arise, for example, in the 
context of making automated video game characters or digital actors. To solve these
problems, standard motion planning algorithms can be adapted if they are given 
information about how to change the modes. Locations in X from which the mode 
can be changed may be expressed as subgoals. Much of the planning effort should 
then be focused on attempting to change modes, in addition to trying to directly 
reach the goal. Applying sampling-based methods requires the definition of a 
metric on X that accounts for both changes in the mode and the configuration. 
A wide variety of hybrid problems can be formulated, ranging from ones that 
are impossible to solve in practice to others that are straightforward extensions 
of standard motion planning. One particularly interesting class of problems, for 
which successful algorithms have been developed, will be covered next. 

7.3.2 Manipulation Planning 

This section presents an overview of manipulation planning; the concepts ex- 
plained here are mainly due to [9, 10]. Returning to Example 7.3.1, imagine that 
the robot must carry a key that is so large that it changes the connectivity of 
Figure 7.16: When the robot changes its configuration, $\mathcal{C}_{free}(m)$ changes, enabling
it to solve the problem. (Panels: elongated mode; compressed mode.)
$\mathcal{C}_{free}$. For the manipulation planning problem, the robot will be called a
manipulator that interacts with a part. In some configurations it is able to grasp the
part and move it to other locations in the environment. The manipulation task 
usually requires moving the part to a specified location in W, without particular 
regard to how the manipulator can accomplish the task. The model considered 
here greatly simplifies the problems of grasping, stability, friction, mechanics, and 
uncertainties, and instead focuses on the geometric aspects (some of these issues 
will be addressed in Sections ??). For a thorough introduction to these other 
important aspects of manipulation planning, see [536]. 

Admissible configurations Assume that the following components from
Formulation 4.3.1 are used here: $\mathcal{W}$, $\mathcal{O}$, and $\mathcal{A}$. For manipulation planning, $\mathcal{A}$ will be
called the manipulator, and let $\mathcal{C}^a$ refer to the manipulator configuration space.
Let $\mathcal{P}$ denote a part, which is a rigid body modeled in terms of geometric
primitives, as described in Section 3.1. It is assumed that $\mathcal{P}$ is allowed to undergo
rigid-body transformations, and will therefore have its own part configuration
space, $\mathcal{C}^p = SE(2)$ or $\mathcal{C}^p = SE(3)$. Let $q^p \in \mathcal{C}^p$ denote a part configuration. The
transformed part model is denoted as $\mathcal{P}(q^p)$.

The combined configuration space, C, is defined as the Cartesian product

$$C = C^a \times C^p, \tag{7.14}$$

in which each configuration $q \in C$ is of the form $q = (q^a, q^p)$. The first
step is to remove all configurations that must be avoided. Parts of Figure 7.17
show examples of these sets. Configurations for which the manipulator collides
with obstacles are

$$C^a_{obs} = \{(q^a, q^p) \in C \mid A(q^a) \cap O \neq \emptyset\}. \tag{7.15}$$

The next logical step is to remove configurations for which the part collides
with obstacles. It will make sense to allow the part to "touch" the obstacles.
For example, this could model a part sitting on a table. Therefore, let

$$C^p_{obs} = \{(q^a, q^p) \in C \mid \operatorname{int}(V(q^p)) \cap O \neq \emptyset\} \tag{7.16}$$

denote the open set for which the interior of the part intersects O. Certainly,
if the part penetrates O, the configuration should be avoided.

Consider $C \setminus (C^a_{obs} \cup C^p_{obs})$. The configurations that
remain ensure that the robot and part do not inappropriately collide with O.
Next consider the interaction between A and V. The manipulator must be allowed
to touch the part, but penetration will once again not be allowed. Therefore,
let

$$C^{pa}_{obs} = \{(q^a, q^p) \in C \mid A(q^a) \cap \operatorname{int}(V(q^p)) \neq \emptyset\}. \tag{7.17}$$

Removing all of these bad configurations yields

$$C_{adm} = C \setminus (C^a_{obs} \cup C^p_{obs} \cup C^{pa}_{obs}), \tag{7.18}$$

which will be called the set of admissible configurations.
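
As a concrete illustration of (7.15) through (7.18), the following minimal
sketch tests admissibility in a toy planar setting. It assumes that the
transformed manipulator, the transformed part, and every obstacle are closed
axis-aligned rectangles with nonempty interiors; the class Box and all function
names are hypothetical and are not part of the formulation above. The point of
the sketch is only the distinction between closed intersection (the manipulator
may not even touch an obstacle) and interior intersection (the part may touch
obstacles and the manipulator, but may not penetrate them).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Closed axis-aligned rectangle [xmin, xmax] x [ymin, ymax] (assumed shape)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def closed_overlap(a: Box, b: Box) -> bool:
    """True if the closed sets a and b intersect; touching counts."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax and
            a.ymin <= b.ymax and b.ymin <= a.ymax)


def interior_overlap(a: Box, b: Box) -> bool:
    """True if int(a) intersects b; contact along the boundary does not count."""
    return (a.xmin < b.xmax and b.xmin < a.xmax and
            a.ymin < b.ymax and b.ymin < a.ymax)


def is_admissible(manip: Box, part: Box, obstacles: list) -> bool:
    """manip plays the role of A(q^a), part the role of V(q^p)."""
    in_a_obs = any(closed_overlap(manip, o) for o in obstacles)    # (7.15)
    in_p_obs = any(interior_overlap(part, o) for o in obstacles)   # (7.16)
    in_pa_obs = interior_overlap(part, manip)                      # (7.17)
    return not (in_a_obs or in_p_obs or in_pa_obs)                 # (7.18)


table = Box(0.0, 0.0, 4.0, 1.0)      # a single obstacle
part = Box(1.0, 1.0, 2.0, 2.0)       # rests on the table (touching is allowed)
gripper = Box(1.0, 2.0, 2.0, 3.0)    # touches the part from above
print(is_admissible(gripper, part, [table]))   # prints True
```

Lowering the part so that it sinks into the table, or moving the gripper so
that it touches the table, makes the configuration inadmissible.
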
Figure 7.17: Examples of several important subsets of C for manipulation
planning.
Stable and grasped configurations Two important subsets of $C_{adm}$ will be
used in the manipulation planning problem. See Figure 7.17. Let $C^p_{sta}$
denote the set of stable part configurations, which are configurations at which
the part can safely rest without any forces being applied by the manipulator.
This means that a part cannot, for example, float in the air. It also cannot be
in a configuration from which it might fall. The particular stable
configurations depend on properties such as the part geometry, friction, and
mass distribution. These issues are not considered here. From this, let
$C_{sta} \subseteq C_{adm}$ be the corresponding stable configurations, defined
as

$$C_{sta} = \{(q^a, q^p) \in C_{adm} \mid q^p \in C^p_{sta}\}. \tag{7.19}$$

The other important subset of $C_{adm}$ is the set of all configurations in
which the robot is grasping the part (and is capable of carrying it, if
necessary). Let $C_{gr} \subseteq C_{adm}$ denote this set of grasped
configurations. For every configuration $(q^a, q^p) \in C_{gr}$, we require that
the manipulator touches the part. This means that
$A(q^a) \cap V(q^p) \neq \emptyset$ (penetration is still not allowed because
$C_{gr} \subseteq C_{adm}$). In general, many configurations at which $A(q^a)$
contacts $V(q^p)$ will not be in $C_{gr}$. The conditions for a configuration to
lie in $C_{gr}$ depend on the particular characteristics of the manipulator, the
part, and the contact surface between them. For example, a typical manipulator
would not be able to pick up a block by making contact with only one corner of
it. This level of detail is not defined here; see [] for more information about
grasping.

We must always ensure that either $q \in C_{sta}$ or $q \in C_{gr}$. Therefore,
let $C_{free} = C_{sta} \cup C_{gr}$, to reflect the subset of $C_{adm}$ that
will actually be allowed for manipulation planning.

The mode space, M, contains two modes, which are named the transit mode
and the transfer mode. In the transit mode, the manipulator is not carrying the
part, which requires that $q \in C_{sta}$. In the transfer mode, the manipulator
carries the part, which requires that $q \in C_{gr}$. Based on these simple
conditions, the only way the mode can change is if $q \in C_{sta} \cap C_{gr}$.
Therefore, the manipulator is given two actions only when in these
configurations. In all other configurations the mode must remain constant. For
convenience, let $C_{tra} = C_{sta} \cap C_{gr}$ denote the set of transition
configurations, which are the places in which the mode may change.
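
The mode rules can be summarized in a few lines of code. The sketch below is
only illustrative: it assumes that membership of q in the stable and grasped
sets has already been decided by some external test and is passed in as two
booleans, and the names Mode, is_valid_state, and next_modes are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    TRANSIT = "transit"      # manipulator moves alone; requires q in C_sta
    TRANSFER = "transfer"    # manipulator carries the part; requires q in C_gr


def is_valid_state(mode, in_sta, in_gr):
    """True if the pair (q, mode) satisfies the requirement of its mode."""
    return in_sta if mode is Mode.TRANSIT else in_gr


def next_modes(mode, in_sta, in_gr):
    """Modes that may follow the current one at configuration q."""
    if in_sta and in_gr:
        # q lies in C_tra, so both actions (keep or switch) are available.
        return {Mode.TRANSIT, Mode.TRANSFER}
    # Everywhere else the mode must remain constant.
    return {mode}
```

For example, next_modes(Mode.TRANSIT, True, False) contains only the transit
mode, whereas at a transition configuration both modes are returned.
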

Using the framework of Section 7.3.1, the mode space, M, and configuration
space, C, are combined to yield the state space, $X = C \times M$. Since there
are only two modes, there are only two copies of C, one for each mode.
State-based sets, $X_{free}$, $X_{tra}$, $X_{sta}$, and $X_{gr}$, are directly
obtained from $C_{free}$, $C_{tra}$, $C_{sta}$, and $C_{gr}$ by ignoring the
mode. For example,

$$X_{tra} = \{(q, m) \in X \mid q \in C_{tra}\}. \tag{7.20}$$

The sets $X_{free}$, $X_{sta}$, and $X_{gr}$ are similarly defined.

The task can now be defined. An initial part configuration,
$q^p_{init} \in C^p_{sta}$, and a goal part configuration,
$q^p_{goal} \in C^p_{sta}$, are specified. Compute a path
$\tau : [0, 1] \to X_{free}$ such that $\tau(0) = q^p_{init}$ and
$\tau(1) = q^p_{goal}$. Furthermore, the action function $\sigma : [0, 1] \to U$
must be specified to indicate the appropriate mode changes whenever
$\tau(s) \in X_{tra}$. A solution can be considered as an alternating sequence
of transit paths and transfer paths, whose names follow from the mode. This is
depicted in Figure 7.18.




Figure 7.18: The solution to a manipulation planning problem alternates between
the two layers of X. The transitions can only occur when $x \in X_{tra}$.



Manipulation graph The manipulation planning problem can generally be
solved by forming a manipulation graph, $G_m$ [9, 10]. Let a connected
component of $X_{tra}$ refer to any connected component of $C_{tra}$ that is
lifted into the state space by ignoring the mode. In other words, there are two
copies of each connected component of $C_{tra}$, one for each mode. For each
connected component of $X_{tra}$, a vertex exists in $G_m$. An edge is defined
for each transfer path or transit path that connects two connected components of
$X_{tra}$. The general approach to manipulation planning then becomes:

1. Compute the connected components of $X_{tra}$.

2. Compute the edges of $G_m$ by applying ordinary planning methods to each
pair of vertices of $G_m$.

3. Apply planning methods to connect the initial and goal states to every
vertex of $G_m$ that can be reached without a mode transition.

4. Search $G_m$ for a path that connects the initial and goal states. If one
exists, then extract the corresponding solution as a sequence of transit and
transfer paths (this implies the actions taken by the robot to execute the
required mode changes).
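
The four steps above can be sketched as a small graph search. This is a minimal
illustration under several assumptions, not the method of [9, 10]: the
connected components of the transition set are assumed to be given as labels,
the initial and goal states are simply added as two extra vertices, and the
hypothetical callback try_connect stands in for an ordinary motion planner that
reports whether a transit or transfer path joins two vertices.

```python
from collections import deque
from itertools import combinations


def build_graph(vertices, try_connect):
    """Steps 2 and 3: one planner query per pair of vertices."""
    graph = {v: [] for v in vertices}
    for u, v in combinations(vertices, 2):
        kind = try_connect(u, v)          # "transit", "transfer", or None
        if kind is not None:
            graph[u].append((v, kind))
            graph[v].append((u, kind))
    return graph


def search(graph, start, goal):
    """Step 4: breadth-first search; returns [(from, to, kind), ...] or None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:
            plan = []
            while parent[u] is not None:
                prev, kind = parent[u]
                plan.append((prev, u, kind))
                u = prev
            return plan[::-1]
        for v, kind in graph[u]:
            if v not in parent:
                parent[v] = (u, kind)
                queue.append(v)
    return None


# Toy data (hypothetical): which pairs the planner managed to connect.
feasible = {("init", "T1"): "transit",
            ("T1", "T2"): "transfer",
            ("T2", "goal"): "transit"}


def toy_connect(u, v):
    return feasible.get((u, v)) or feasible.get((v, u))


graph = build_graph(["init", "T1", "T2", "goal"], toy_connect)
plan = search(graph, "init", "goal")
```

In this toy instance the extracted plan alternates transit, transfer, transit,
which matches the alternating structure of a solution described earlier.
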
Figure 7.19: This example was solved in [177] using the manipulation planning
framework and the visibility-based roadmap planner. It is very challenging
because the same part must be regrasped in many places.
Multiple parts The manipulation planning framework nicely generalizes to
multiple parts, $V_1, \ldots, V_k$. Each part has its own part configuration
space, and C is formed by taking the Cartesian product of all part configuration
spaces with the manipulator configuration space. The set $C_{adm}$ is defined in
a similar way, but now part-part collisions also have to be removed, in addition
to part-manipulator, manipulator-obstacle, and part-obstacle collisions. The
definition of $C_{sta}$ requires that all parts are in stable configurations;
the parts may even be allowed to stack on top of each other. The definition of
$C_{gr}$ requires that one part is grasped and all other parts are stable. There
are still two modes, depending on whether or not the manipulator is grasping a
part. Transitions once again only occur when the robot is in
$C_{tra} = C_{sta} \cap C_{gr}$. The task involves moving all parts from one
configuration to another. This is achieved once again by defining a manipulation
graph and obtaining a sequence of transit paths (in which no parts move) and
transfer paths (in which one part is carried and all other parts are fixed). A
challenging problem solved by a motion planning algorithm is shown in Figure
7.19.

Other generalizations are possible. A generalization to k robots would lead to
$2^k$ modes, in which each mode indicates whether or not each robot is grasping.
Multiple robots could even grasp the same object. Another generalization could 
allow a single robot to grasp more than one object. 

7.4 Planning for Closed Kinematic Chains 

This section continues where Section 4.4 finished. The subspace of C that
results from maintaining kinematic closure was defined and illustrated through
some examples. Planning in this context requires that paths remain on a
lower-dimensional variety for which a parameterization is not available. Many
important applications require motion planning while maintaining these
constraints. For example, consider a manipulation problem in which multiple
manipulators grasp the same object, which forms a closed loop, as shown in
Figure 7.21. A loop exists because both manipulators are attached to the ground,
which may itself be considered as a link. The development of virtual actors for
movies and video games also involves related manipulation problems. Loops also
arise in this context when more than one human limb is touching a fixed surface
(e.g., two feet on the ground). A class of robots called parallel manipulators
is designed with internal closed loops [550]. For example, consider the
Stewart-Gough platform [296, 724] illustrated in Figure 7.20. The lengths of
each of the six arms, $A_1, \ldots, A_6$, can be independently varied, while
they remain attached via spherical joints to the ground and to the platform,
which is $A_7$. Each arm can actually be imagined as two links that are
connected by a prismatic joint. Due to the total of 6 degrees of freedom
introduced by the variable lengths, the platform actually achieves the full 6
degrees of freedom (hence, some neighborhood in SE(3) is obtained for $A_7$).
Planning the motion of the Stewart-Gough platform, or robots that are based on
the platform (the robot shown in Figure 7.29 uses a stack of several of these
mechanisms), requires handling many closure constraints that must be maintained
simultaneously. Another application is computational biology, in which the
configuration space of molecules is searched, many of which are derived from
molecules that have closed, flexible chains of bonds [].

Figure 7.20: An illustration of the Stewart-Gough platform (adapted from a
figure made by Frank Sottile).

7.4.1 Adaptation of Motion Planning Algorithms 

First, the planning problem will be precisely defined. All of the components
from the general motion planning problem of Model 4.3.1 are included: W, O,
$A_1, \ldots, A_r$, C, $q_i$, and $q_g$. It is assumed that the robot is a
collection of r links that are possibly attached in loops.

It will be assumed in this section that $C = \mathbb{R}^n$. If this is not
satisfactory, there are two ways to overcome the assumption. The first is to
represent SO(2) and SO(3) as $\mathbb{S}^1$ and $\mathbb{S}^3$, respectively,
and include the circle or sphere equation as part of the constraints considered
here. This avoids the topology problems. The other option is to abandon the
restriction of using $\mathbb{R}^n$ and instead use a parameterization of C that
is of the appropriate dimension. To perform calculus on such manifolds, a
differentiable structure is required, which is introduced in Section ??. In the
presentation here, however, vector calculus on $\mathbb{R}^n$ is sufficient,
which intentionally avoids these extra technicalities.

Figure 7.21: Two or more manipulators manipulating the same object cause
closed kinematic chains. Each black disc corresponds to a revolute joint.



Closure constraints The closure constraints introduced in Section 4.4 can be
summarized as follows. There is a set V of polynomials $f_1, \ldots, f_k$, which
belong to $\mathbb{Q}[q_1, \ldots, q_n]$ and express the constraints for
particular points on the links of the robot. The determination of these is
detailed in Section 4.4.3. As mentioned above, polynomials that force points to
lie on a circle or sphere in the case of rotations may also be included in V.

The closure space, $C_{clo}$, is defined as

$$C_{clo} = \{q \in C \mid \forall f_i \in V,\ f_i(q_1, \ldots, q_n) = 0\}, \tag{7.21}$$

which is an m-dimensional subspace of C that corresponds to all configurations
that satisfy the closure constraints. The obstacle set must also be taken into
account. Once again, $C_{obs}$ and $C_{free}$ can be defined using (4.40). The
feasible space, $C_{fea}$, is defined as $C_{fea} = C_{clo} \cap C_{free}$,
which are the configurations that satisfy closure constraints and avoid
collisions.
Let n denote the dimension of C. The motion planning problem then becomes
the task of finding a path $\tau : [0, 1] \to C_{fea}$ such that
$\tau(0) = q_i$ and $\tau(1) = q_g$. The new challenge is that there is no
explicit parameterization of $C_{fea}$, which is further complicated by the fact
that m < n.

Combinatorial methods Since the constraints are expressed with polynomi- 
als, it may not be surprising that the computational algebraic geometry methods 
of Section 6.4 can solve the general motion planning problem with closed kinematic 
chains. Either cylindrical algebraic decomposition or Canny's roadmap algorithm 
may be applied. As mentioned in Section 6.5.3, an adaptation of the roadmap 
algorithm which is optimized for problems in which m < n is given in [57]. 

Sampling-based methods Most of the methods of Chapter 5 are not easy to
adapt because they require sampling in $C_{fea}$, for which a parameterization
does not even exist. If points in a bounded region of $\mathbb{R}^n$ are chosen
at random, the probability is zero that a point on $C_{fea}$ will be hit. Some
incremental sampling and searching methods can, however, be adapted by the
construction of a local planning method (LPM) that is suited for problems with
closure constraints. The sampling-based roadmap methods require many samples to
be generated directly on $C_{fea}$. Section 7.4.2 presents some techniques that
can be used to generate such samples for certain classes of problems, enabling
the development of efficient sampling-based planners, and also improving the
efficiency of incremental search planners. Before covering these techniques, we
first present a method that leads to a more general sampling-based planner and
is easier to implement. However, if designed well, planners based on the
techniques of Section 7.4.2 are more efficient.

Figure 7.22: For the RDT, the samples can be drawn from a region in
$\mathbb{R}^n$, the space in which $C_{clo}$ is embedded.

We now consider adapting the RDT of Section 5.5 to work for problems with
closure constraints. Similar adaptations may be possible for other incremental
sampling and searching methods, covered in Section 5.4, such as the randomized
potential field planner. A dense sampling sequence, $\alpha$, is generated over
a bounded n-dimensional subset of $\mathbb{R}^n$, such as a rectangle or sphere,
as shown in Figure 7.22. The samples are not actually required to lie on
$C_{clo}$ because they do not necessarily become part of the topological graph,
G. They mainly serve to pull the search tree in different directions. One
concern in choosing the bounding region is to make it large enough to include
$C_{clo}$ (at least the connected component that includes $q_i$), but as small
as possible while satisfying this requirement. Such bounds can be obtained by
carefully analyzing the motion limits for a particular linkage.

Figure 7.23: For each sample $\alpha(i)$, the nearest point, $q_n \in C$, is
found, and then the local planner generates a motion that lies in the local
tangent plane. The motion is the projection of the vector from $q_n$ to
$\alpha(i)$ onto the tangent plane.

Stepping along $C_{clo}$ The RDT algorithm given in Figure 5.27 can be applied
directly; however, the stopping-configuration function in Line 4 must be
changed to account for both obstacles and the constraints that define
$C_{clo}$. Figure 7.23 shows the general approach, which is based on numerical
continuation [?]. The nearest RDT vertex, $q \in C$, to the sample $\alpha(i)$
is first computed. Let $v = \alpha(i) - q$, which represents the direction in
which an edge would be made from q if there were no constraints. A local motion
is then computed by projecting v into the tangent plane of $C_{clo}$ at the
point q. Since $C_{clo}$ is generally nonlinear, the local motion will produce a
point that is not precisely on $C_{clo}$. Some numerical tolerance is generally
accepted, and a small enough step is taken to ensure that the tolerance is
maintained. The process iterates by computing v with respect to the new point
and moving in the direction of v projected into the new tangent plane. If the
error threshold is reached, then motions must be executed in the normal
direction to return to $C_{clo}$. This process terminates when progress can no
longer be made, either due to the alignment of the tangent plane (nearly
perpendicular to v) or due to an obstacle. This finally yields $q_s$, the
stopping configuration. The new path followed in $C_{fea}$ is no longer a
"straight line" as was possible for some problems in Section 5.5; therefore, the
approximate methods in Section 5.5.2 should be used to create intermediate
vertices along the path.

In each iteration, the tangent plane is computed at some $q \in C_{clo}$ as
follows. The differential configuration vector dq lies in the tangent space of a
constraint $f_i(q) = 0$ if

$$\frac{\partial f_i(q)}{\partial q_1} dq_1 + \frac{\partial f_i(q)}{\partial q_2} dq_2 + \cdots + \frac{\partial f_i(q)}{\partial q_n} dq_n = 0. \tag{7.22}$$



This leads to the following homogeneous system for all of the k polynomials in V 
that define the closure constraints: 



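$$\begin{pmatrix}
\dfrac{\partial f_1(q)}{\partial q_1} & \dfrac{\partial f_1(q)}{\partial q_2} & \cdots & \dfrac{\partial f_1(q)}{\partial q_n} \\
\dfrac{\partial f_2(q)}{\partial q_1} & \dfrac{\partial f_2(q)}{\partial q_2} & \cdots & \dfrac{\partial f_2(q)}{\partial q_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial f_k(q)}{\partial q_1} & \dfrac{\partial f_k(q)}{\partial q_2} & \cdots & \dfrac{\partial f_k(q)}{\partial q_n}
\end{pmatrix}
\begin{pmatrix} dq_1 \\ dq_2 \\ \vdots \\ dq_n \end{pmatrix} = 0. \tag{7.23}$$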



If the matrix has rank n - m (so that $C_{clo}$ is m-dimensional at q), then m
configuration displacements can be chosen independently, and the remaining
n - m are determined by (7.23). The null space of the matrix is the tangent
space at q; linear algebra techniques, such as singular value decomposition
(SVD) [287], can be used to compute an orthonormal basis for it. Let
$e_1, \ldots, e_m$ denote these n-dimensional basis vectors. The components of
the motion direction are obtained from $v = \alpha(i) - q$. First, construct the
inner products, $a_1 = v \cdot e_1$, $a_2 = v \cdot e_2$, ...,
$a_m = v \cdot e_m$. Using these, the projection of v in the tangent plane is
the n-dimensional vector w given by

$$w = \sum_{i=1}^{m} a_i e_i. \tag{7.24}$$



This is used as the direction of motion. The magnitude must be appropriately
scaled to take sufficiently small steps. Because $C_{clo}$ is nonlinear, a
motion in this direction will generally leave $C_{clo}$. A motion in the inward
normal direction is then required to move back onto $C_{clo}$.
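
The projection of (7.24) and the return motion can be sketched numerically as
follows. This is a minimal illustration that assumes NumPy is available and
ignores obstacles entirely. The function names, the step size, the tolerances,
and the use of Gauss-Newton iterations with a pseudoinverse for the return
motion are all choices made for this sketch; the text only asks for some motion
in the normal direction. The unit sphere in three dimensions is used as a
stand-in for the closure space.

```python
import numpy as np


def tangent_basis(jacobian, tol=1e-9):
    """Rows form an orthonormal basis of the null space of the k x n matrix."""
    _, s, vt = np.linalg.svd(jacobian)
    rank = int(np.sum(s > tol))
    return vt[rank:]


def project(v, basis):
    """Projection of v onto the span of the basis rows, as in (7.24)."""
    return basis.T @ (basis @ v)


def step(q, sample, f, jac, step_size=0.05, eps=1e-9, min_ratio=0.05):
    """One tangent step from q toward sample, then a return to f(q) = 0."""
    v = sample - q
    w = project(v, tangent_basis(jac(q)))
    norm = np.linalg.norm(w)
    if norm <= min_ratio * np.linalg.norm(v):
        return None                      # tangent plane nearly perpendicular to v
    q_new = q + step_size * w / norm     # leaves the constraint set slightly
    for _ in range(20):                  # move back along the constraint normals
        residual = f(q_new)
        if np.linalg.norm(residual) < eps:
            return q_new
        q_new = q_new - np.linalg.pinv(jac(q_new)) @ residual
    return None


def sphere(q):
    """Toy closure polynomial: the unit sphere (an assumption of this sketch)."""
    return np.array([q @ q - 1.0])


def sphere_jac(q):
    return 2.0 * q[None, :]


q = np.array([1.0, 0.0, 0.0])
sample = np.array([0.0, 0.0, 2.0])       # the sample need not satisfy closure
for _ in range(200):
    q_next = step(q, sample, sphere, sphere_jac)
    if q_next is None:
        break                            # this q plays the role of q_s
    q = q_next
```

In this toy run the configuration slides along the sphere toward the point
closest to the sample and stops once v is nearly normal to the tangent plane.
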

Because the dimension, m, of $C_{clo}$ is less than n, the procedure described
above can only produce numerical approximations to paths in $C_{clo}$. This
problem also arises in implicit curve tracing in graphics and geometric modeling
[331]. Therefore, each constraint $f_i(q_1, \ldots, q_n) = 0$ is actually
slightly weakened to $|f_i(q_1, \ldots, q_n)| < \epsilon$ for some fixed
tolerance $\epsilon > 0$. This essentially "thickens" $C_{clo}$ so that its
dimension is n. As an alternative to computing the tangent plane, motion
directions can be sampled directly inside of this thickened region. This results
in an easier implementation, but it is not as efficient [780].
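
The effect of the thickening can be seen on a toy constraint. The sketch below
assumes a single closure polynomial whose zero set is the unit circle in the
plane, which is not an example from the text, and estimates how often a uniform
random sample from a bounding square satisfies the weakened constraint. The
function names and the bounding square are hypothetical.

```python
import random


def f(q):
    """Toy closure polynomial: zero exactly on the unit circle."""
    return q[0] ** 2 + q[1] ** 2 - 1.0


def fraction_accepted(eps, trials=20000, seed=0):
    """Fraction of uniform samples in [-2, 2]^2 with |f(q)| <= eps."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        q = (rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0))
        if abs(f(q)) <= eps:
            hits += 1
    return hits / trials
```

With a tolerance of zero essentially no sample is accepted, which reflects the
probability-zero statement made earlier for random points. With a tolerance of
0.1 the accepted region is an annulus of area 0.2 pi, so roughly 4 percent of
the samples in this square are accepted.
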

7.4.2 Active-Passive Link Decompositions 

An alternative sampling-based approach is to perform an active-passive
decomposition, which is used to generate samples in $C_{clo}$ by directly
sampling active variables and computing the closure values for passive variables
using inverse kinematics methods. This method was introduced in [313] and
subsequently improved through the development of the random loop generator in
[177, 363]. The method serves as a general framework that can adapt virtually
any of the methods of Chapter 5 to handle closed kinematic chains, and
experimental evidence suggests that performance is better than the method of
Section 7.4.1. One drawback is that the method requires some careful analysis of
the linkage to determine the best decomposition and also bounds on its mobility.
Such analysis exists for very general classes of linkages [177]; however, many
challenging cases remain unsolved.

Active and passive variables In this section, let C denote the configuration
space obtained from all of the joint variables, instead of requiring
$C = \mathbb{R}^n$, as in Section 7.4.1. This means that V includes only
polynomials that encode closure constraints, as opposed to allowing constraints
that represent rotations. Using the tree representation from Section 4.4.3, this
means that C is of dimension n, arising from assigning one variable for every
joint of the linkage in the absence of any constraints. Let $q \in C$ denote
this vector of configuration variables. The active-passive decomposition
partitions the variables of q to form two vectors, $q^a$, called the active
variables, and $q^p$, called the passive variables. The values of passive
variables will always be determined from the active variables by enforcing the
closure constraints and using inverse kinematics techniques to compute their
values. If m is the dimension of $C_{clo}$, then there are always m active
variables and n - m passive variables.

Temporarily, suppose that the linkage forms a single loop as shown in Figure
7.24. One possible decomposition into active, $q^a$, and passive, $q^p$,
variables is given in Figure 7.25. The linkage, when constrained to form a loop,
has four degrees of freedom, assuming the bottom link is rigidly attached to the
ground. This means that values can be chosen for four active joint angles,
$q^a$, and the remaining three, $q^p$, can be derived from solving the inverse
kinematics. To determine $q^p$, note that there will be three equations and
three unknowns. Unfortunately, these equations are nonlinear and fairly
complicated. Nevertheless, efficient solutions exist for this case and for the
three-dimensional generalization [529]. For a three-dimensional loop formed of
revolute joints, there are six passive variables. The number, 3, of passive
variables in $\mathbb{R}^2$ and the number 6 for $\mathbb{R}^3$ arise from the
dimensions of SE(2) and SE(3), respectively. This is the freedom that is
stripped away from the system by enforcing the closure constraints. Methods for
efficiently computing inverse kinematics in two and three dimensions are given
in [20].

Figure 7.24: A chain of links in the plane. There are 7 links and 7 joints,
which are constrained to form a loop. The dimension of C is 7, but the dimension
of $C_{clo}$ is 4.

Figure 7.25: Three of the joint variables can be determined automatically by
inverse kinematics. Therefore, 4 of the joints may be designated as active, and
the remaining 3 will be passive.




Figure 7.26: In this case, the active variables are chosen in a way that makes it 
impossible to assign passive variables that close the loop. 

There will be at most a finite number of solutions to the inverse kinematics
problem, often leading to several choices for the passive variables. It could
also be the case that for some assignments of active variables, there are no
solutions to the inverse kinematics. An example is depicted in Figure 7.26.
Suppose that we want to generate samples in $C_{clo}$ by selecting random values
for $q^a$ and then using inverse kinematics for $q^p$. What is the probability
that a solution to the inverse kinematics exists? For the example shown, it
appears that most of the time solutions would not exist.
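
For a planar loop with 7 links and 7 revolute joints, the naive scheme can be
sketched directly. The code below is an illustration under stated assumptions
and is not the inverse kinematics method of [529] or [20]: the ground link is
assumed to run from the origin to the point (d, 0), the four active variables
are the relative joint angles of the first four moving links, and the last two
links are closed by intersecting two circles (the law of cosines). All names
are hypothetical.

```python
import math
import random


def wrap(angle):
    """Map an angle to the range [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def close_loop(q_active, lengths, d):
    """Return the passive triples (q5, q6, q7) that close the loop, if any.

    lengths holds the six moving link lengths; the degenerate case in which
    the gap is zero (infinitely many solutions) is reported as no solution.
    """
    x = y = heading = 0.0
    for theta, length in zip(q_active, lengths[:4]):
        heading += theta
        x += length * math.cos(heading)
        y += length * math.sin(heading)
    a, b = lengths[4], lengths[5]
    dx, dy = d - x, -y
    gap = math.hypot(dx, dy)
    if gap == 0.0 or gap > a + b or gap < abs(a - b):
        return []                                  # closure is impossible
    cos_alpha = (a * a + gap * gap - b * b) / (2.0 * a * gap)
    alpha = math.acos(max(-1.0, min(1.0, cos_alpha)))
    to_base = math.atan2(dy, dx)
    solutions = []
    for sign in (1.0, -1.0):                       # the two elbow choices
        h5 = to_base + sign * alpha                # direction of link 5
        ex, ey = x + a * math.cos(h5), y + a * math.sin(h5)
        h6 = math.atan2(-ey, d - ex)               # direction of link 6
        # q7 is measured from link 6 to the ground link, which points
        # from (d, 0) back to the origin (direction pi).
        solutions.append((wrap(h5 - heading), wrap(h6 - h5), wrap(math.pi - h6)))
    return solutions


def success_rate(lengths, d, trials=10000, seed=0):
    """Estimate how often uniformly random active angles admit closure."""
    rng = random.Random(seed)
    ok = 0
    for _ in range(trials):
        q_active = [rng.uniform(-math.pi, math.pi) for _ in range(4)]
        if close_loop(q_active, lengths, d):
            ok += 1
    return ok / trials
```

Calling success_rate with long active links and short passive links, for
instance lengths [3, 3, 3, 3, 1, 1] and d = 1, gives a feel for how rarely
naive sampling of the active variables leads to closure.
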

Loop generator The sampling method in [177, 363] (termed the random loop
generator) greatly improves the chance of obtaining closure by iteratively
restricting the range on each of the active variables. The method requires that
the active variables appear sequentially along the chain (i.e., there is no
interleaving of active and passive variables). The m coordinates of $q^a$ are
obtained sequentially as follows. First, compute an interval, $I_1$, of
allowable values for $q^a_1$. The interval serves as a loose bound in the sense
that for any value $q^a_1 \notin I_1$, it is known for certain that closure
cannot be obtained. This is ensured by performing careful geometric analysis of
the linkage, which will be explained shortly. The next step is to generate a
sample $q^a_1 \in I_1$, which is accomplished in [177] by picking a random point
in $I_1$. Using the value $q^a_1$, a bounding interval $I_2$ is computed for
allowable values of $q^a_2$. The value $q^a_2$ is obtained by sampling in
$I_2$. This process continues iteratively until $I_m$ and $q^a_m$ are obtained,
unless it terminates early because some $I_i = \emptyset$ for $i < m$. If
successful termination occurs, then the active variables $q^a$ are used to find
values $q^p$ for the passive variables. This step still might fail, but the
probability of success is now much higher. The method can also apply to linkages
in which there are multiple, common loops, as in the Stewart-Gough platform, by
breaking the linkage into a tree and closing loops one at a time using the loop
generator. The performance depends on how the linkage is decomposed [177].
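
The sequential restriction can be illustrated on a planar chain. The sketch
below is a simplified stand-in for the geometric analysis of [177], not a
reproduction of it. It assumes revolute joints without limits, a chain that
starts at the origin and must end at a given base point, and it samples the
absolute direction of each active link rather than the relative joint angle.
The only bound used is the loosest one: after placing a link, its endpoint must
stay within the total length of the links that remain. All names are
hypothetical.

```python
import math
import random


def allowed_interval(p, length, base, reach):
    """Directions (center, half_width) of the next link that keep its endpoint
    within distance reach of base, or None if no direction works."""
    dx, dy = base[0] - p[0], base[1] - p[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return (0.0, math.pi) if length <= reach else None
    # Law of cosines: the new distance is at most reach iff cos(phi - beta) >= c.
    c = (dist * dist + length * length - reach * reach) / (2.0 * dist * length)
    if c > 1.0 + 1e-12:
        return None                        # the interval is empty
    if c <= -1.0:
        return (0.0, math.pi)              # every direction is allowed
    return (math.atan2(dy, dx), math.acos(min(1.0, c)))


def sample_active(lengths, base, num_active, rng):
    """Sample the active link directions one at a time; None on early failure."""
    p = (0.0, 0.0)
    directions = []
    for i in range(num_active):
        reach = sum(lengths[i + 1:])       # total length of links not yet placed
        interval = allowed_interval(p, lengths[i], base, reach)
        if interval is None:
            return None                    # terminate early
        center, half = interval
        phi = rng.uniform(center - half, center + half)
        directions.append(phi)
        p = (p[0] + lengths[i] * math.cos(phi),
             p[1] + lengths[i] * math.sin(phi))
    return directions, p
```

Every sample that is returned leaves the endpoint of the last active link
within reach of the base for the remaining passive links. The passive step can
still fail, for example when the passive links have unequal lengths and the gap
is too small, which matches the remark that closure is made more likely rather
than guaranteed.
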




Figure 7.27: If any joint angle is possible, then the links sweep out a circle
in the limit.




Figure 7.28: If there are limits on the joint angles, then a tighter bound can be 
obtained for the reachability of the linkage. 
Computing bounds on joint angles The main requirement for successful
application of the method is the ability to compute bounds on how far a chain
of links can travel in W over some range of variables. For example, for a planar
chain that has revolute joints with no limits, the chain can sweep out a circle
as shown in Figure 7.27. Suppose it is known that the angle between links must
remain between $-\pi/6$ and $\pi/6$. A tighter bounding region can then be
obtained, as shown in Figure 7.28. Three-dimensional versions of these bounds,
along with many necessary details, are included in [177]. These bounds are then
used to compute $I_i$ in each iteration of the sampling algorithm.
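
The tightening due to joint limits can be checked numerically on a small
example. The sketch below is a brute-force estimate, not the analytic bounds of
[177]. It assumes a planar chain whose first joint at the base is unrestricted,
so that only the distance from the base to the endpoint matters, and whose
remaining relative joint angles all lie in a symmetric interval. The function
name and the grid resolution are hypothetical.

```python
import math
from itertools import product


def endpoint_distance_range(lengths, limit, samples=25):
    """Grid estimate of the (min, max) distance from the base to the endpoint
    when every relative joint angle after the first lies in [-limit, limit]."""
    grid = [-limit + 2.0 * limit * i / (samples - 1) for i in range(samples)]
    lo, hi = math.inf, 0.0
    for angles in product(grid, repeat=len(lengths) - 1):
        x, y, heading = lengths[0], 0.0, 0.0   # first link along the x-axis
        for theta, length in zip(angles, lengths[1:]):
            heading += theta
            x += length * math.cos(heading)
            y += length * math.sin(heading)
        dist = math.hypot(x, y)
        lo, hi = min(lo, dist), max(hi, dist)
    return lo, hi
```

For three unit links with no limits (limit equal to pi) the endpoint can reach
any distance from 0 to 3, which corresponds to the full disc. With the limit
set to pi/6 the minimum distance rises to about 2.73, so the endpoint is
confined to a thin annulus.
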

Now that there is an efficient method that generates samples directly in
$C_{clo}$, it is straightforward to adapt any of the sampling-based planning
methods of Chapter 5. In [177] many impressive results are obtained for
challenging problems in which the dimension of C is up to 97 and the dimension
of $C_{clo}$ is up to 25; see Figure 7.29. These methods are based on applying
this sampling technique to the RDTs of Section 5.5 and the visibility
sampling-based roadmap of Section 5.6.2. For these algorithms, the local
planning method is applied to the active variables, and inverse kinematics
algorithms are used for the passive variables in the path validation step. This
means that inverse kinematics and collision checking are performed together,
instead of only collision checking, as described in Section 5.3.4.

One important issue that has been neglected in this section is the existence of
kinematic singularities, which cause the dimension of $C_{clo}$ to drop in the
vicinity of certain points. The methods presented here have assumed that solving
the motion planning problem does not require passing through a singularity.
This assumption is reasonable for robot systems that have many extra degrees of
freedom, but it is important to understand that completeness is lost in general
because the sampling-based methods do not explicitly handle these degeneracies.
In a sense, they occur below the level of sampling resolution. For more
information on kinematic singularities and related issues, see [550].

7.5 Folding Problems in Robotics and Biology 

A growing number of motion planning applications involve some form of folding. 
Examples include automated carton folding, computer-aided drug design, protein 
folding, modular reconfigurable robots, and even robotic origami. These problems 
are generally modeled as a linkage in which all bodies are connected by revolute 
joints. In robotics, self-collision between pairs of bodies usually must be avoided. 
In biological applications, energy functions replace obstacles. Instead of crisp
obstacle boundaries, energy functions can be imagined as "soft" obstacles, in
which a real value is defined for every $q \in C$, instead of defining a set
$C_{obs} \subset C$. For a given threshold value, such energy functions can be
converted into an obstacle region by defining $C_{obs}$ to be the configurations
that have energy above the threshold. However, the energy function contains more
information because such thresholds are arbitrary. This section briefly shows
some examples of folding problems and techniques from the recent motion planning
literature.

Figure 7.29: Planning for the Logabex LX4 robot [?]. This solution was computed
in less than a minute in [177] by applying active-passive decomposition to an
RDT-based planner. In this example, the dimension of C is 97 and the dimension
of $C_{clo}$ is 25.




Figure 7.30: An important packaging problem is to automate the folding of a
perforated sheet of cardboard into a carton. (Panels: carton blank; carton ready
for loading.)

Carton folding An interesting application of motion planning to the automated
folding of boxes is presented in [509]. Figure 7.30 shows a carton in its
original flat form and in its folded form. As shown in Figure 7.31, the problem
can be modeled as a tree of bodies connected by revolute joints. Once this model
has been formulated, many methods from Chapters 5 and 6 can be adapted for this
problem. In [509], a planning algorithm optimized particularly for box folding
is presented. It is an adaptation of an approximate cell decomposition algorithm
developed for kinematic chains in [505]. Its complexity is exponential in the
degrees of freedom of the carton, but it gives good performance on practical
examples. One such solution that was found by motion planning is shown in Figure
7.32. To use these solutions in a factory, the manipulation problem has to be
additionally considered. For example, as demonstrated in [509], a manipulator
arm robot can be used in combination with a well-designed set of fixtures. The
fixtures help hold the carton in place while the manipulator applies pressure in
the right places, which yields the required folds. Since the feasibility with
fixtures depends on the particular folding path, the planning algorithm
generates all possible distinct paths from the initial, flat configuration.

Figure 7.31: The carton can be cleverly modeled as a tree of bodies that are
attached by revolute joints.

Figure 7.32: A folding sequence that was computed using the algorithm in [509].

Simplifying knots A knot is defined as a closed curve that does not intersect
itself, is embedded in $\mathbb{R}^3$, and cannot be untangled to produce a
simple loop. If the knot is allowed to intersect itself, then any knot can be
untangled; therefore, a careful definition of what it means to untangle a knot
is needed. For a closed curve, $\tau : [0, 1] \to \mathbb{R}^3$, embedded in
$\mathbb{R}^3$ (it cannot intersect itself), let the set
$\mathbb{R}^3 \setminus \tau([0, 1])$ of points not reached by the curve be
called the ambient space of $\tau$. In knot theory, an ambient isotopy between
two closed curves, $\tau_1$ and $\tau_2$, embedded in $\mathbb{R}^3$, is a
homeomorphism between their ambient spaces. Intuitively, this means that
$\tau_1$ can be warped into $\tau_2$ without allowing any self-intersections.
Therefore, determining whether two loops are equivalent seems closely related to
motion planning. Such equivalence gives rise to groups that characterize the
space of knots, and are closely related to the fundamental group described in
Section 4.1.3. For more information on knot theory, see [3, 328, 382].

A motion planning approach was developed in [424] to determine whether a 
closed curve is equivalent to the unknot, which is completely untangled. This 
can be expressed as a curve that maps onto S 1 embedded in R 3 . The algorithm 
takes as input a knot expressed as a circular chain of line segments embedded in 
R 3 . In this case, the unknot can be expressed as a triangle in R 3 . One of the 
most challenging examples solved by the planner is shown in Figure 7.33. The 
planner is sampling-based and shares many similarities with the RDT algorithm 
of Section 5.5, and the Ariadne's clew and expansive space planners described 
in Section 5.4.4. Since the task is not to produce a collision-free path, there are 
several unique aspects in comparison to motion planning. An energy function is
defined over the collection of segments to try to guide the search toward
simpler configurations. There are two kinds of local operations that are made by
the planner: 1) Try to move a vertex toward a selected subgoal in the ambient
space. This is obtained by using random sampling to grow a search tree. 2) Try
to delete a vertex, and connect the neighboring vertices by a straight line. If
no collision occurs, then the knot has been simplified. The algorithm terminates
when it is unable to further simplify the knot.
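
The second local operation (vertex deletion) can be sketched as follows. This is only an illustration and not the planner of [424]: it assumes that "no collision" means no other segment of the chain crosses the triangle swept when a vertex is replaced by a straight shortcut, it ignores degenerate coplanar contacts, and all function names are hypothetical.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def segment_hits_triangle(p, q, a, b, c, tol=1e-12):
    """Moller-Trumbore style test: does segment pq cross triangle abc?"""
    d, e1, e2 = sub(q, p), sub(b, a), sub(c, a)
    h = cross(d, e2)
    det = dot(e1, h)
    if abs(det) < tol:
        return False  # parallel or coplanar: ignored in this sketch
    s = sub(p, a)
    u = dot(s, h) / det          # barycentric coordinate along e1
    qv = cross(s, e1)
    v = dot(d, qv) / det         # barycentric coordinate along e2
    t = dot(e2, qv) / det        # parameter along the segment
    return u >= 0 and v >= 0 and u + v <= 1 and 0 <= t <= 1

def try_delete_vertex(chain, i):
    """Return the chain without vertex i, or None if the shortcut collides."""
    n = len(chain)
    a, b, c = chain[(i - 1) % n], chain[i], chain[(i + 1) % n]
    near = {(i - 1) % n, i, (i + 1) % n}
    for j in range(n):
        k = (j + 1) % n
        if j in near or k in near:
            continue  # skip segments that touch the triangle's own vertices
        if segment_hits_triangle(chain[j], chain[k], a, b, c):
            return None
    return chain[:i] + chain[i + 1:]

def simplify(chain):
    """Greedily delete vertices until none can be removed or a triangle remains."""
    chain = list(chain)
    changed = True
    while changed and len(chain) > 3:
        changed = False
        for i in range(len(chain)):
            shorter = try_delete_vertex(chain, i)
            if shorter is not None:
                chain, changed = shorter, True
                break
    return chain

# A slightly non-planar convex hexagon is an unknot and collapses to a triangle.
hexagon = [(0, 0, 0), (2, 0, 0.1), (3, 1, 0), (2, 2, 0.2), (0, 2, 0), (-1, 1, 0.1)]
simplified = simplify(hexagon)
```

Greedy deletion alone can get stuck on hard unknots, which is why the planner also uses the first operation (moving vertices toward subgoals).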
Figure 7.33: The planner in [424] untangles the famous Ochiai unknot benchmark
in a few minutes on a standard PC. 
Figure 7.34: A 3D model of protein-ligand docking.



Drug Design A sampling-based motion planning approach to pharmaceutical 
drug design is taken in [455]. The development of a drug is a long, incremental 
process, typically requiring years of research and experimentation. The goal is 
to find a relatively small molecule (called a ligand), typically comprising a few
dozen atoms, that docks with a receptor cavity in a specific protein [473]; Figure 
7.34 shows an illustration. Examples of drug molecules were given in Figure 3.22. 
Protein-ligand docking can stimulate or inhibit some biological activity, ultimately 
leading to the desired pharmacological effect. The problem of finding suitable 
ligands is complicated due to both energy considerations and the flexibility of 
the ligand. In addition to satisfying structural considerations, factors such as 
synthetic accessibility, drug pharmacology and toxicology greatly complicate and 
lengthen the search for the most effective drug molecules. 

One popular model used by chemists in the context of drug design is a phar- 
macophore, which serves as a template for the desired ligand [167, 249, 275, 681]. 
The pharmacophore is expressed as a set of features that an effective ligand should 
possess and a set of spatial constraints among the features. The features can be 
specific atoms, centers of benzene rings, positive or negative charges, hydrophobic 
or hydrophilic centers, hydrogen bond donors or acceptors, and others. These
features generally require that parts of the molecule must remain fixed in $\mathbb{R}^3$, which
induces kinematic closure constraints. These features are developed by chemists
to encapsulate the assumption that ligand binding is due primarily to the
interaction of some features of the ligand with "complementary" features of the receptor.
The interacting features are included in the pharmacophore, which is a template
for screening candidate drugs, and the rest of the ligand atoms merely provide a
scaffold for holding the pharmacophore features in their spatial positions. Figure 7.35
illustrates the pharmacophore concept. 

Candidate drug molecules (ligands), such as the ones shown in Figure 3.22, 
can be modeled as a tree of bodies as shown in Figure 7.36. Some bonds can 
rotate, which yields a revolute joint in the model. Other bonds must remain 
fixed. The drug design problem amounts to searching the space of configurations 
Figure 7.35: A pharmacophore is a model used by chemists to simplify the
interaction process between a ligand (candidate drug molecule) and a protein. It
often amounts to holding certain features of the molecule fixed in R 3 . In this 
example, the positions of three atoms must be fixed relative to the atom to which 
the coordinate frame is assigned. It is assumed that these features interact with 
some complementary features in the cavity of the protein. 



(called conformations) to try to find a low-energy configuration that also places 
certain atoms in specified locations in R 3 . This additional constraint arises from 
the pharmacophore, and causes the planning to occur on $\mathcal{C}_{clo}$ from Section 7.4
because the pharmacophores can be expressed as closure constraints. 

An energy function serves a purpose similar to that of a collision detector. The
evaluation of a ligand for drug design requires determining whether it can achieve 
low-energy conformations that satisfy the pharmacophore constraints. Thus, the 
task is different from standard motion planning in that there is no predetermined 
goal configuration. One of the greatest difficulties is that the energy functions 
are extremely complicated, nonlinear, and empirical. Here is an example used in 
[455]: 

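$$
e(q) = \sum_{bonds} \frac{1}{2} K_b (R - R')^2 + \sum_{ang} \frac{1}{2} K_a (\alpha - \alpha')^2 + \sum_{torsions} K_d \left[ 1 + \cos(p\theta - \theta') \right] + \sum_{i,j} \left\{ 4\epsilon_{ij} \left[ \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{12} - \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{6} \right] + \frac{c_i c_j}{\epsilon r_{ij}} \right\} \tag{7.25}
$$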

The energy of $q \in \mathcal{C}$ accounts for torsion-angle deformations, van der Waals
potential, and Coulomb potentials. In (7.25), the first sum is taken over all bonds,
the second over all bond angles, the third over all rotatable bonds, and the last
sum is taken over all pairs of atoms. The variables are: 1) force constants,







Figure 7.36: The modeling of a flexible molecule is similar to that of a robot. One 
atom is designated as the root, and the remaining bodies are arranged in a tree. 
If there are cyclic chains in the molecules, then constraints as described in Section 
4.4 must be enforced. Typically, only some bonds are capable of rotation, while 
others must remain rigid. 

$K_b$, $K_a$, and $K_d$; 2) the dielectric constant, $\epsilon$; 3) a periodicity constant, $p$; 4) the
Lennard-Jones radii, $\sigma_{ij}$; 5) well depth, $\epsilon_{ij}$; 6) partial charge, $c_i$; 7) measured bond
length, $R$; 8) equilibrium bond length, $R'$; 9) measured bond angle, $\alpha$; 10) equilibrium
bond angle, $\alpha'$; 11) measured torsional angle, $\theta$; 12) equilibrium torsional
angle, $\theta'$; 13) distance between atom centers, $r_{ij}$. Although the energy expression
is very complicated, it only depends on the configuration variables; all others are 
constants that are estimated in advance. 
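
To make the structure of (7.25) concrete, here is a minimal sketch that evaluates an energy of that form: harmonic bond-length and bond-angle terms, a periodic torsion term, and Lennard-Jones plus Coulomb terms over atom pairs. The tuple layouts and the function name are assumptions made for illustration, and the geometric quantities (lengths, angles, distances) are assumed to have been computed from the conformation beforehand.

```python
import math

def molecular_energy(bonds, angles, torsions, pairs, dielectric):
    """Evaluate an empirical energy with the same terms as (7.25).

    bonds:    iterable of (K_b, R, R_eq)
    angles:   iterable of (K_a, alpha, alpha_eq)
    torsions: iterable of (K_d, p, theta, theta_eq)
    pairs:    iterable of (eps_ij, sigma_ij, r_ij, c_i, c_j)
    """
    e = 0.0
    for k_b, r, r_eq in bonds:                      # bond stretching
        e += 0.5 * k_b * (r - r_eq) ** 2
    for k_a, a, a_eq in angles:                     # bond-angle bending
        e += 0.5 * k_a * (a - a_eq) ** 2
    for k_d, p, theta, theta_eq in torsions:        # torsional term
        e += k_d * (1.0 + math.cos(p * theta - theta_eq))
    for eps_ij, sigma, r, c_i, c_j in pairs:        # van der Waals + Coulomb
        s6 = (sigma / r) ** 6
        e += 4.0 * eps_ij * (s6 * s6 - s6) + c_i * c_j / (dielectric * r)
    return e

# Hypothetical numbers: only the stretched bond contributes here.
energy = molecular_energy(
    bonds=[(2.0, 1.5, 1.0)],
    angles=[(1.0, 0.3, 0.3)],
    torsions=[(1.0, 1, math.pi, 0.0)],
    pairs=[(0.1, 3.0, 3.0, 0.0, 0.0)],
    dielectric=1.0,
)
```

In a planner, a function like this plays the role that a collision detector plays in ordinary motion planning: a sampled conformation is kept only if its energy is below a chosen threshold.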

Protein folding In computational biology, the problem of protein folding shares 
many similarities with drug design in that the molecules have rotatable bonds and 
energy functions are used to express good configurations. The problems are much 
more complicated, however, because the protein molecules are generally much 
larger than drug molecules. Instead of a dozen degrees of freedom, which is typi- 
cal for a drug molecule, proteins have hundreds or thousands of degrees of freedom. 
When proteins occur in nature, they are usually in a folded, low-energy configu- 
ration. The structure problem involves determining precisely how the protein is 
folded so that its biological activity can be completely understood. In some stud- 
ies, biologists are even interested in the pathway that a protein takes to arrive in 
its folded state [14, 15]. This leads directly to an extension of motion planning 
that involves arriving at a goal state in which the molecule is folded. In [14, 15], 
sampling-based planning algorithms were applied to compute folding pathways
for proteins. The protein starts in an unfolded configuration and must arrive in a 
specified folded configuration without violating energy constraints along the way. 
Figure 7.37 shows an example from [15]. That work also draws interesting con- 
nections between protein folding and box folding, which was covered previously. 




Figure 7.37: Protein folding for a polypeptide, computed by a sampling-based 
roadmap planning algorithm [14] 



7.6 Coverage Planning 

Imagine automating the motion of a lawnmower for an estate that has many
obstacles, such as a house, trees, a garage, and a complicated property boundary.
What are the best zig-zag motions for the lawnmower? Can the amount of redun- 
dant traversals be minimized? Can the number of times the lawnmower needs to 
be stopped and rotated be minimized? This is one example of coverage planning, 
which is motivated by applications such as lawn mowing, automated farming, 
painting, vacuum cleaning, and mine sweeping. A survey of this area appears in 
[161]. Even for a region in $\mathcal{W} = \mathbb{R}^2$, finding an optimal-length solution to coverage
planning is NP-hard, by reduction from the closely related Traveling Salesman
Problem [22, 563]. Therefore, we are willing to tolerate approximate or even heuristic
solutions to the general coverage problem, even in $\mathbb{R}^2$.

Boustrophedon decomposition One approach to the coverage problem is to
decompose $\mathcal{C}_{free}$ into cells, and perform boustrophedon (from Greek "ox turning")
motions in each cell as shown in Figure 7.38 [163]. It is assumed that the robot is
a point in $\mathcal{W} = \mathbb{R}^2$, but it carries a tool of thickness $\epsilon$ that hangs evenly over the
sides of the robot. This enables it to paint or mow part of $\mathcal{C}_{free}$ up to distance $\epsilon/2$
from either side of the robot as it moves forward. Such motions are often used in
printers to minimize the number of carriage returns.
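
As a small illustration of the back-and-forth motion inside a single cell, the following sketch generates zig-zag waypoints for an axis-aligned rectangular cell. The rectangular cell shape and the function name are simplifying assumptions; real cells of the decomposition have varying upper and lower boundaries.

```python
import math

def boustrophedon_waypoints(x_min, x_max, y_min, y_max, tool_width):
    """Zig-zag waypoints that sweep a rectangle with vertical strips.

    Each strip is at most tool_width wide, and the robot drives along the
    strip centerline so the tool overhangs evenly on both sides.
    """
    width = x_max - x_min
    n_strips = max(1, math.ceil(width / tool_width))
    strip = width / n_strips            # actual strip width, <= tool_width
    waypoints = []
    for i in range(n_strips):
        x = x_min + (i + 0.5) * strip   # centerline of strip i
        if i % 2 == 0:                  # sweep upward on even strips
            waypoints += [(x, y_min), (x, y_max)]
        else:                           # sweep downward on odd strips
            waypoints += [(x, y_max), (x, y_min)]
    return waypoints

path = boustrophedon_waypoints(0.0, 4.0, 0.0, 3.0, tool_width=1.0)
```

Consecutive waypoints alternate between long sweeps and short sideways shifts, so the number of turns grows with the number of strips rather than with the cell area.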

If $\mathcal{C}_{obs}$ is polygonal, a reasonable decomposition can be obtained by adapting
the vertical decomposition method of Section 6.2.2. In that algorithm, critical 
Figure 7.38: An example of the ox plowing motions.



events were defined for several cases, some of which are not relevant for the
boustrophedon motions. The only events that need to be handled are shown in Figure
7.39(a) [160]. This produces a decomposition that has fewer cells, as shown in
Figure 7.39(b). Even though the cells are nonconvex, they can always be sliced nicely
into vertical strips, which makes them suitable for boustrophedon motions. The 
original vertical decomposition could also be used, but the extra cell boundaries 
would cause unnecessary repositioning of the robot. A similar method, which 
furthermore optimizes the number of robot turns, is presented in [349]. 

Spanning tree covering An interesting approximate method can be made by
placing a tiling of squares inside of $\mathcal{C}_{free}$, and computing the spanning tree of the
resulting connectivity graph [268, 269]. Suppose again that $\mathcal{C}_{free}$ is polygonal.
Consider the example shown in Figure 7.40. The first step is to tile the interior
of $\mathcal{C}_{free}$ with squares, as shown in Figure 7.41. Each square should be of width $\epsilon$.
Next, construct a roadmap, $\mathcal{G}$, by placing a vertex in the center of each square,
and by defining an edge that connects the centers of each pair of adjacent squares.
The next step is to compute a spanning tree of $\mathcal{G}$. This is a connected subgraph
that has no cycles and touches every vertex of $\mathcal{G}$, and can be easily computed
in $O(n)$ time, if $\mathcal{G}$ has $n$ edges [539]. There are many possible spanning trees,
and a criterion can be defined and optimized to induce preferences. One possible
spanning tree is shown in Figures 7.42 and 7.43.
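
A minimal sketch of the spanning tree step, assuming the tiling is given as a set of integer square indices and that adjacency means sharing a side. Breadth-first search is used here only because it is simple and runs in linear time; any other spanning tree algorithm could be substituted, and the function name is hypothetical.

```python
from collections import deque

def grid_spanning_tree(cells):
    """Return the edges of one spanning tree of the square-adjacency graph.

    cells: iterable of (x, y) integer indices of the squares in the tiling.
    """
    cells = set(cells)
    root = min(cells)
    visited = {root}
    edges = []
    queue = deque([root])
    while queue:
        x, y = queue.popleft()
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nxt in cells and nxt not in visited:
                visited.add(nxt)
                edges.append(((x, y), nxt))   # tree edge between square centers
                queue.append(nxt)
    if len(visited) != len(cells):
        raise ValueError("the tiling is not connected")
    return edges

# A 3-by-3 tiling with the middle square removed (an obstacle).
cells = {(x, y) for x in range(3) for y in range(3)} - {(1, 1)}
tree_edges = grid_spanning_tree(cells)
```

A tree on the eight remaining squares has exactly seven edges, which is the defining count for a spanning tree.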

Once the spanning tree is made, the robot path is obtained by starting at 
a point near the spanning tree and following along its perimeter as shown in 
Figure 7.44. This path can be precisely specified as shown in Figure 7.45. Double 
Figure 7.39: a) Only the first case from Figure 6.2 is needed: extend upward
and downward. All other cases are neglected. b) The resulting decomposition is
shown, which has fewer cells than that of the vertical decomposition in Figure 6.3.




Figure 7.40: An example used for spanning tree covering. 

the resolution of the tiling, and form the corresponding roadmap. Part of the
roadmap will correspond to the spanning tree, but it also includes a loop path that
surrounds the spanning tree, which can be extracted. This path visits the centers of the
new squares. The resulting path for the example of Figure 7.40 is shown in Figure
7.46. In general, the method yields an optimal route, once the approximation is
given. A bound on uncovered area due to approximation is given in [268]. Versions
of the method that do not require an initial map are also given in [268, 269]; this
involves reasoning about information spaces, which are covered in Chapter 11.
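
The circumnavigation step can be sketched as follows, assuming integer square indices and a spanning tree given as a list of edges between adjacent squares. Each square is split into four subsquares; two neighboring subsquares are linked if they lie in the same square and no tree edge passes between them, or if they lie in two squares joined by a tree edge. Every subsquare then has exactly two links, and following them traces the loop around the tree. The function name and data layout are hypothetical.

```python
def spanning_tree_cover(cells, tree_edges):
    """Return the closed loop of subsquares that surrounds the spanning tree."""
    cells = set(cells)
    tree = set()
    for a, b in tree_edges:
        tree.add((a, b))
        tree.add((b, a))

    def links(sub):
        sx, sy = sub
        cell = (sx // 2, sy // 2)
        out = []
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (sx + dx, sy + dy)
            ncell = (nxt[0] // 2, nxt[1] // 2)
            if ncell == cell:
                # Inside one square: blocked if a tree edge leaves the square
                # center through the boundary between the two subsquares.
                if dx != 0:
                    blocker = (cell[0], cell[1] + (1 if sy % 2 else -1))
                else:
                    blocker = (cell[0] + (1 if sx % 2 else -1), cell[1])
                if (cell, blocker) not in tree:
                    out.append(nxt)
            elif (cell, ncell) in tree:
                out.append(nxt)  # cross into a neighbor along a tree edge
        return out

    first = min(cells)
    start = (2 * first[0], 2 * first[1])
    path, prev, cur = [start], None, start
    while True:
        a, b = links(cur)
        nxt = a if a != prev else b
        if nxt == start:
            return path
        path.append(nxt)
        prev, cur = cur, nxt

# A 3-by-2 tiling with a hand-specified spanning tree.
cells = {(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)}
tree = [((0, 0), (1, 0)), ((1, 0), (2, 0)),
        ((0, 0), (0, 1)), ((1, 0), (1, 1)), ((2, 0), (2, 1))]
loop = spanning_tree_cover(cells, tree)
```

The loop visits each of the 24 subsquares exactly once and returns to its start, so no part of the tiled area is traversed twice.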

7.7 Optimal Motion Planning 

This section can be considered transitional in many ways. The main concern so far 
with motion planning has been feasibility as opposed to optimality. This placed 
Figure 7.41: The first step is to tile the interior with squares.

the focus on finding any solution, rather than adding the additional requirement 
that a solution be optimal. In later parts of the book, especially as uncertainty 
is introduced, optimality will receive more attention. Even the most basic forms 
of decision theory, the topic of Chapter 9, center on making optimal choices. The 
requirement of optimality in very general settings usually requires an exhaustive 
search over the state space, which amounts to computing continuous cost-to-go 
functions. Once such functions are known, a feedback strategy is obtained, which 
is much more powerful than having only a path. Thus, optimality will also appear 
frequently in the design of feedback strategies because it sometimes comes at no 
additional cost. This will become clearer in Chapter 8. The quest for optimal 
solutions also raises interesting issues about how to approximate a continuous 
problem as a discrete problem. The interplay between time discretization and
space discretization becomes very important in relating continuous and discrete
planning problems. 

7.7.1 Optimality for One Robot 

Euclidean shortest paths One of the most straightforward notions of optimality
is Euclidean shortest paths in $\mathbb{R}^2$ or $\mathbb{R}^3$. Suppose that $\mathcal{A}$ is a rigid body
that translates only in either $\mathcal{W} = \mathbb{R}^2$ or $\mathcal{W} = \mathbb{R}^3$, which contains an obstacle
region $\mathcal{O} \subset \mathcal{W}$. Recall that normally, $\mathcal{C}_{free}$ is an open set, which means that one
can take any path, $\tau : [0,1] \rightarrow \mathcal{C}_{free}$, and make it shorter. Therefore, shortest
paths for motion planning must be considered on the closure, $cl(\mathcal{C}_{free})$, which
allows the robot to make contact with the obstacles; however, their interiors must
not intersect.
Figure 7.43: The resulting spanning tree is shown without the grid.

Figure 7.44: A circular path is made that follows the perimeter of the spanning
tree.




Figure 7.45: A circular path is made by doubling the resolution and following the 
perimeter of the spanning tree. 
Figure 7.46: The resulting spanning tree covering for the problem in Figure 7.40.




Figure 7.47: For a polyhedral environment, the shortest paths do not have to cross 
vertices. Therefore, the shortest path roadmap method from Section ?? does not 
extend to three dimensions. 

For the case in which $\mathcal{C}_{obs}$ is a polygonal region, the shortest path roadmap
method of Section 6.2.4 has already been given. This can be considered as a
kind of multiple-query approach because the roadmap completely captures the
structure needed to construct the shortest path for any query. It is possible to
make a single-query algorithm using the continuous Dijkstra paradigm [562, 321].
This method propagates a wavefront from $q_I$ and keeps track of critical events
in maintaining the wavefront. As events occur, the wavefront becomes composed
of wavelets, which are arcs of circles centered on obstacle vertices. The possible
events that can occur are: 1) a wavelet disappears, 2) a wavelet collides with an
obstacle vertex, 3) a wavelet collides with another wavelet, or 4) a wavelet collides
with a point in the interior of an obstacle edge. The method can be made to run
in time $O(n \lg n)$ and uses $O(n \lg n)$ space. A roadmap is constructed that uses
$O(n)$ space.
Such elegant methods leave the impression that finding shortest paths is not
very difficult, but unfortunately, they do not generalize nicely to $\mathbb{R}^3$ and a
polyhedral $\mathcal{C}_{obs}$. Figure 7.47 shows a simple example in which the shortest path does
not have to cross a vertex of $\mathcal{C}_{obs}$. It may cross anywhere in the interior of an
edge; therefore, it is not clear where to draw the bitangent lines that would form
the shortest path roadmap. The lower bounds for this problem are also
discouraging. It was shown in [122] that the three-dimensional shortest path problem
in a polyhedral environment is NP-hard. Most of the difficulty arises because of
the precision required to represent three-dimensional shortest paths. If that
requirement is relaxed, efficient polynomial-time approximation algorithms exist
[158, 159, 606].

General optimality criteria It is difficult to even define optimality for more 
general configuration spaces. What does it mean to have a shortest path in SE(2) 
or SE(3)? Consider the case of a planar, rigid robot that can translate or rotate.
One path could try to minimize the amount of rotation, while another tries to
minimize the amount of translation. Without more information, there is no clear
choice. Ulam's distance is one possibility, which is to minimize the distance trav- 
eled by k fixed points [358]. In Chapter ??, differential models will be introduced, 
which greatly facilitate the natural expression of optimal paths. For example, the 
shortest paths for a car-like robot are shown in Section ??, but these require a 
precise specification of the constraints on the motion of a car (it is naturally more 
costly to move a car sideways than forward; hence, parallel parking is difficult). 

In this section, we take some steps in this direction to formulate optimal motion 
planning problems, to provide a kind of smooth transition toward the later con- 
cepts. Up to now, actions were used in Chapter 2 for discrete planning problems, 
but were successfully avoided for basic motion planning by directly describing 
paths that map into $\mathcal{C}_{free}$. It will be convenient to use them once again. Recall
that they were convenient for defining costs and optimal planning in Section 2.4. 

To avoid for now the complications of differential equations, consider making 
an approximate model of motion planning in which every path must be composed 
of a sequence of shortest-path segments in $\mathcal{C}_{free}$. Most often these will be line
segments; however, for the case of $SO(3)$, circular segments obtained by spherical
linear interpolation may be preferable. Consider extending Formulation 2.4.2 from 
Section 2.4.2 to the problem of motion planning. 

Let the configuration space, $\mathcal{C}$, be embedded in $\mathbb{R}^m$ (i.e., $\mathcal{C} \subset \mathbb{R}^m$). An action
will be defined shortly as an $m$-dimensional vector. Given a scaling constant, $\epsilon$,
and a configuration, $q$, an action, $u$, will produce a new configuration, $q' = q + \epsilon u$.
This can be considered as a configuration transition equation, $q' = f(q,u)$. The
path segment represented by the action $u$ is the shortest path (usually a line
segment) between $q$ and $q'$. Following Section 2.4, let $\pi_K$ denote a $K$-step plan,
which is a sequence $(u_1, u_2, \ldots, u_K)$ of $K$ actions. Note that if $\pi_K$ and $q_I$ are
given, then a sequence of states, $q_1, q_2, \ldots, q_{K+1}$, can be derived using the state
transition equation, $f$. Initially, $q_1 = q_I$, and each following state is obtained by
$q_{k+1} = f(q_k, u_k)$. This also leads to a path $[0,1] \rightarrow \mathcal{C}$.
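
The derivation of the configuration sequence from a plan can be sketched in a few lines, assuming configurations and actions are plain tuples in R^m and that collision checking is handled elsewhere. The function name is hypothetical.

```python
def rollout(q_init, plan, eps):
    """Apply q' = q + eps * u for each action of a K-step plan.

    Returns the K + 1 configurations visited, starting with the initial one.
    """
    q = tuple(q_init)
    configs = [q]
    for u in plan:
        q = tuple(qi + eps * ui for qi, ui in zip(q, u))
        configs.append(q)
    return configs

# A 3-step plan in the plane with step size 0.5.
configs = rollout((0.0, 0.0), [(1, 0), (0, 1), (1, 0)], eps=0.5)
```

Connecting consecutive configurations by line segments gives the piecewise-linear path that the plan represents.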

The approximate optimal planning problem is now formalized as follows: 

Formulation 7.7.1 (Approximate Optimal Motion Planning) 

1. A number, $K$, of stages. The current stage, $k$, is indicated by a subscript,
to obtain $q_k$ and $u_k$.

2. The following components are defined the same as in Model 4.3.1: $\mathcal{W}$, $\mathcal{O}$, $\mathcal{A}$,
$\mathcal{C}$, $\mathcal{C}_{obs}$, $\mathcal{C}_{free}$, and $q_I$. It is assumed that $\mathcal{C} \subset \mathbb{R}^m$, for some positive integer
$m$.

3. For each $q \in \mathcal{C}$, a possibly-infinite action space, $U(q)$. Each $u \in U$ is an
$m$-dimensional vector.

4. A positive constant, $\epsilon > 0$, called the step size.

5. A configuration transition function, $f(q,u) = q + \epsilon u$, in which $q + \epsilon u$ is
computed by vector addition on $\mathbb{R}^m$.

6. Instead of a goal state, a goal region, $X_G$, is defined.

7. Let $L$ denote a real-valued, additive cost (or loss) functional, which is
applied to a $K$-step plan, $\pi_K$. This means that the sequence, $(u_1, \ldots, u_K)$, of
actions and the sequence, $(q_1, \ldots, q_{K+1})$, of configurations may appear in an
expression of $L$. For convenience, let $F = K + 1$, to denote the final state
(note that the application of $u_K$ advances the stage to $K + 1$). The cost
functional is

$$
L(\pi_K) = \sum_{k=1}^{K} l(q_k, u_k) + l_F(q_F). \tag{7.26}
$$

The final term, $l_F(q_F)$, is outside of the sum, and is defined as $l_F(q_F) = 0$
if $q_F \in X_G$, and $l_F(q_F) = \infty$, otherwise. Just as in Formulation 2.4.2, it is
assumed that $K$ is not necessarily a constant.

8. Each $U(q)$ contains a special termination action, $u_T$, which behaves the
same way as in Formulation 2.4.2. If $u_T$ is applied to $q_k$, at stage $k$, then the
action is repeatedly applied forever, the configuration remains in $q_k$ forever,
and no more cost accumulates.
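
A minimal sketch of evaluating the cost functional (7.26) for a given plan: the stage costs are summed over the K actions, and the final term is zero if the last configuration lies in the goal region and infinite otherwise. The callables `stage_cost` and `in_goal` are hypothetical stand-ins for the per-stage loss and the goal-region test, and the termination action is not modeled here.

```python
import math

def plan_cost(configs, plan, stage_cost, in_goal):
    """Sum of stage costs over the plan, plus the final goal term.

    configs: the K + 1 configurations q_1, ..., q_{K+1}
    plan:    the K actions u_1, ..., u_K
    """
    total = sum(stage_cost(q, u) for q, u in zip(configs, plan))
    final = 0.0 if in_goal(configs[-1]) else math.inf
    return total + final

configs = [(0, 0), (1, 0), (1, 1)]
plan = [(1, 0), (0, 1)]
unit_cost = lambda q, u: 1.0

reaches_goal = plan_cost(configs, plan, unit_cost, lambda q: q == (1, 1))
misses_goal = plan_cost(configs, plan, unit_cost, lambda q: q == (5, 5))
```

Because plans that miss the goal receive infinite cost, minimizing this functional automatically restricts attention to plans that end in the goal region.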

Formulation 7.7.1 can be used to define a variety of optimal planning
problems. The parameter $\epsilon$ can be considered as the resolution of the approximation.
In many formulations it can be interpreted as a time step, $\epsilon = \Delta t$; however, note
that no explicit time reference is necessary because the problem only requires
constructing a path through $\mathcal{C}_{free}$. As $\epsilon$ approaches zero, the formulation approaches
an exact optimal planning problem. To properly express the exact problem,
differential equations are needed. This is deferred until Section ??.
Figure 7.48: Under the Manhattan ($L_1$) motion model, all monotonic paths that
follow the grid directions have equivalent length.




Figure 7.49: Depictions for the action sets, $U(q)$, for Examples 7.7.1, 7.7.2, and
7.7.3 (panels: Manhattan, Independent Joint, Euclidean).

Example 7.7.1 (Manhattan Motion Model) Suppose that $U(q)$ is
defined as a set of $2m$ vectors in which only one component is nonzero and must
take the value $1$ or $-1$. For example, if $m = 2$, then

$$
U(q) = \{(1,0), (-1,0), (0,-1), (0,1)\}. \tag{7.27}
$$

When used in the configuration transition equation, this set of actions produces
"up", "down", "left", and "right" motions. The action sets for this example and the
following two examples are shown in Figure ?? for comparison. The loss $l(q_k, u_k)$
is defined to be 1 for all $q_k \in \mathcal{C}_{free}$ and $u_k$. If $q_k \in \mathcal{C}_{obs}$, then $l(q_k, u_k) = \infty$.
Note that the set of configurations reachable by these actions will lie on a grid,
in which the spacing between 1-neighbors is $\epsilon$. This corresponds to a convenient
special case in which time-discretization (implemented by $\epsilon$) leads to a nice
space-discretization. Consider Figure 7.48. It is impossible to take a shorter path along
a diagonal because the actions do not allow it. Therefore, all monotonic paths
along the grid produce the same costs.

Optimal paths can be obtained by simply applying the dynamic programming 
algorithms of Chapter 2. This example provides a nice unification of concepts from 
Section 2.3, which introduced grid search, and Section 5.4.2, which explained how
to adapt search methods to motion planning. In the current setting, only algo- 
rithms that produce optimal solutions on the corresponding graph are acceptable. 

This form of optimization might not seem too relevant since it does not
represent the Euclidean shortest path problem for $\mathbb{R}^2$. The next model adds more
actions, and does correspond to an important class of optimization problems in
robotics. ■
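
For the Manhattan model with unit stage costs, optimal costs-to-go on the grid can be computed by a backward breadth-first sweep from the goal, which is the simplest instance of the dynamic programming idea. The sketch below assumes that the free configurations are given as a finite set of integer grid points (in units of the step size) and that the goal is a single grid point; the names are hypothetical.

```python
from collections import deque

MANHATTAN_ACTIONS = [(1, 0), (-1, 0), (0, -1), (0, 1)]

def manhattan_cost_to_go(goal, free):
    """Optimal number of unit-cost steps from every reachable point to goal."""
    cost = {goal: 0}
    queue = deque([goal])
    while queue:
        q = queue.popleft()
        for u in MANHATTAN_ACTIONS:
            p = (q[0] - u[0], q[1] - u[1])   # predecessor: p + u = q
            if p in free and p not in cost:
                cost[p] = cost[q] + 1
                queue.append(p)
    return cost

# An obstacle-free 4-by-3 grid with the goal in the far corner.
free = {(x, y) for x in range(4) for y in range(3)}
cost = manhattan_cost_to_go((3, 2), free)
```

From the corner (0, 0) every monotonic path uses three horizontal and two vertical steps, so the cost-to-go is 5 regardless of the order of the moves.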



Example 7.7.2 (Independent Joint Model) Now suppose that $U(q)$ is the
set of all $3^m$ vectors for which every element is either $-1$, $0$, or $1$. Now a path
can be taken along any diagonal. This still does not change the fact that all
reachable configurations lie on a grid. Therefore, the standard grid algorithms
can once again be applied. The difference is that there are now $3^m - 1$
edges emanating from every vertex, as opposed to $2m$ in Example 7.7.1. This
model is appropriate for robots that are constructed from a collection of links
attached by revolute joints. If each joint is operated independently, then it makes
sense that each joint could either be moved forward, moved backwards, or held
stationary. This corresponds exactly to the actions. However, this model cannot
nicely approximate Euclidean shortest paths; this motivates the next example. ■
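
The two action sets can be enumerated directly, which makes the edge counts easy to check. This is a small sketch with hypothetical function names; the all-zero vector is excluded from the independent joint set when counting edges because it does not move the robot.

```python
from itertools import product

def manhattan_actions(m):
    """The 2m unit vectors with a single nonzero component of +1 or -1."""
    actions = []
    for i in range(m):
        for sign in (1, -1):
            u = [0] * m
            u[i] = sign
            actions.append(tuple(u))
    return actions

def independent_joint_actions(m):
    """All vectors in {-1, 0, 1}^m except the zero vector."""
    return [u for u in product((-1, 0, 1), repeat=m) if any(u)]

m = 3
n_manhattan = len(manhattan_actions(m))            # 2m
n_independent = len(independent_joint_actions(m))  # 3^m - 1
```

The exponential growth of the second set with the number of joints is one reason grid search becomes expensive for robots with many degrees of freedom.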



Example 7.7.3 (Euclidean Motion Model) To approximate Euclidean shortest
paths, let $U(q) = \mathbb{S}^{m-1}$, in which $\mathbb{S}^{m-1}$ is the $(m-1)$-dimensional unit sphere,
centered at the origin of $\mathbb{R}^m$. This means that in $k$ stages, any piecewise-linear
path in which each segment has length $\epsilon$ can be formed by a sequence of inputs.
Therefore, the set of reachable states is no longer confined to a grid. Consider
taking $\epsilon = 1$, and pick any point, such as $(\pi, \pi) \in \mathbb{R}^2$. How close can you come
to this point? It turns out that the set of points reachable with this model is
dense in $\mathbb{R}^m$ if obstacles are neglected. This means that we can come arbitrarily
close to any point in $\mathbb{R}^m$. Therefore, a finite grid cannot be used to represent the
problem. Approximate solutions can still be obtained by computing a numerical
approximation to an optimal cost-to-go defined over $\mathcal{C}$. This approach will be
presented in Section ??.

One additional issue for this problem is the precision defined for the goal. If
the goal region is very small relative to $\epsilon$, then complicated paths may have to be
taken to arrive precisely at the goal. ■
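
To illustrate the Euclidean action set, the sketch below draws actions from the unit sphere by normalizing Gaussian vectors (a standard way to sample a uniformly random direction) and applies them with a fixed step size. It only demonstrates that every segment has the same length while the visited points leave the grid; it is not a planner, and the function names are hypothetical.

```python
import math
import random

def sample_unit_action(m, rng):
    """Draw a direction uniformly at random from the unit sphere in R^m."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(m)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-12:
            return tuple(x / norm for x in v)

def random_walk(q_init, num_stages, eps, rng):
    """Apply num_stages random unit actions with step size eps."""
    q = tuple(q_init)
    configs = [q]
    for _ in range(num_stages):
        u = sample_unit_action(len(q), rng)
        q = tuple(qi + eps * ui for qi, ui in zip(q, u))
        configs.append(q)
    return configs

rng = random.Random(0)
configs = random_walk((0.0, 0.0), 5, 1.0, rng)
lengths = [math.dist(a, b) for a, b in zip(configs, configs[1:])]
```

Because the action space is continuous, a search over this model has to sample or discretize the actions, which is what motivates the numerical cost-to-go approximations mentioned in the example.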



Example 7.7.4 (Weighted Region Problem) In outdoor and planetary nav- 
igation applications, it does not make sense to define obstacles in the crisp way 
that has been used so far. It is more convenient to associate a cost with each 
patch of terrain, which indicates the estimated difficulty of traversal. This is 
sometimes considered as a "gray scale" model of obstacles. The model can be easily
captured in the cost term $l(q_k, u_k)$. The action spaces can be borrowed from
Examples 7.7.1 or 7.7.2. A grid-based search algorithm called D* is introduced 
in [?] which generates optimal navigation plans for this problem, assuming that 
the terrain is initially unknown. Theoretical bounds for optimal weighted-region 
planning problems are given in [563]. ■ 
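
When the terrain is fully known, the weighted-region model on a grid reduces to a shortest-path computation with nonuniform stage costs. The sketch below uses ordinary Dijkstra search from the goal, not the D* algorithm, which additionally handles initially unknown terrain. It assumes that the cost of applying any action at a cell is that cell's terrain cost; the dictionary layout and names are hypothetical.

```python
import heapq
import math

ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def weighted_cost_to_go(terrain, goal):
    """Backward Dijkstra: terrain maps each grid cell to its traversal cost."""
    cost = {goal: 0.0}
    heap = [(0.0, goal)]
    while heap:
        c, q = heapq.heappop(heap)
        if c > cost[q]:
            continue  # stale queue entry
        for dx, dy in ACTIONS:
            p = (q[0] - dx, q[1] - dy)       # predecessor of q
            if p not in terrain:
                continue
            new = c + terrain[p]             # pay the cost of leaving p
            if new < cost.get(p, math.inf):
                cost[p] = new
                heapq.heappush(heap, (new, p))
    return cost

# Two rows of terrain; the middle cell of the bottom row is costly.
terrain = {(0, 0): 1.0, (1, 0): 5.0, (2, 0): 1.0,
           (0, 1): 1.0, (1, 1): 1.0, (2, 1): 1.0}
cost = weighted_cost_to_go(terrain, goal=(2, 0))
```

From (0, 0) the direct route costs 1 + 5 = 6, while the detour through the upper row costs 4, so the optimal plan avoids the difficult patch even though it takes more steps.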



7.7.2 Multiple-Robot Optimality 

Suppose that there are two robots as shown in Figure 7.50. There is just enough
room to enable the robots to translate along the corridors. Each would like to
arrive at the bottom, as indicated by arrows; however, only one at a time can
pass through the horizontal corridor. Suppose that at any instant each robot can
either be on or off. When it is on, it moves at its maximum speed, and when it
is off, it is stopped.³ Now suppose that each robot would like to reach its goal as
quickly as possible. This means each would like to minimize the total amount of
time that it is off. In this example, there appear to be only two sensible choices:
1) $\mathcal{A}^1$ stays on and moves straight to its goal while $\mathcal{A}^2$ is off just long enough to
let $\mathcal{A}^1$ pass, and then moves to its goal. 2) The opposite situation occurs, in which
$\mathcal{A}^2$ stays on and $\mathcal{A}^1$ must wait. Note that when a robot waits, there are multiple
locations at which it could wait and still yield the same time to reach the goal.
The only important information is how long the robot was off.

Thus, the two interesting strategies are that either $\mathcal{A}^2$ is off for some amount
of time, $t_{off} > 0$, or $\mathcal{A}^1$ is off for time $t_{off}$. Consider a vector of costs of the form
$(L^1, L^2)$, in which each component represents the cost for each robot. The costs
of the strategies could be measured in terms of time wasted by waiting. This
yields $(0, t_{off})$ and $(t_{off}, 0)$ for the cost vectors associated with the two strategies
(we could equivalently define cost to be the total time traveled by each robot;
the time on is the same for both robots and can be subtracted from each for
this simple example). The two strategies are better than or equivalent to any
others. Strategies with this property are called nondominated or Pareto optimal.
For example, if $\mathcal{A}^2$ waits 1 second too long for $\mathcal{A}^1$ to pass, then the resulting costs
are $(0, t_{off} + 1)$, which is clearly worse than $(0, t_{off})$. The resulting strategy is not
Pareto optimal.

Another way to solve the problem is to scalarize the costs by mapping them 
to a single value. For example, we could find strategies that optimize the average 
wasted time. In this case, one of the two best strategies would be obtained, 
yielding $t_{off}/2$ average wasted time. However, no information is retained about
which robot had to make the sacrifice. Scalarizing the costs usually imposes some 
kind of artificial preference or prioritization among the robots. Ultimately, only 



³ This model allows infinite acceleration. Imagine that the speeds are slow enough to allow
this approximation. If this is still not satisfactory, then jump ahead to Chapter 13. 
Figure 7.50: There are two Pareto-optimal coordination strategies for this
problem, depending on which robot has to wait.

one strategy can be chosen, which might make it seem inappropriate to maintain 
multiple solutions. However, finding and presenting the alternative Pareto optimal 
solutions could provide valuable information if, for example, these robots are 
involved in a complicated application that involves many other time-dependent 
processes. Presenting the Pareto optimal solutions is equivalent to discarding all 
of the worse strategies, and showing the best alternatives. In some applications, 
priorities between robots may change, and if a scheduler of robots has access to 
the Pareto optimal solutions, it is easy to change priorities by switching between 
Pareto optimal strategies without having to generate new plans each time. 

Now the Pareto optimality concept will be made more precise and general.
Suppose there are $m$ robots, $\mathcal{A}^1, \ldots, \mathcal{A}^m$. Let $\gamma$ refer to a motion strategy that
gives the paths and timing functions for all robots. For each $\mathcal{A}^i$, let $L^i$ denote its
cost functional, which yields a value $L^i(\gamma) \in [0, \infty]$ for a given strategy, $\gamma$. An
$m$-dimensional vector, $L(\gamma)$, is defined as

$$
L(\gamma) = (L^1(\gamma), L^2(\gamma), \ldots, L^m(\gamma)). \tag{7.28}
$$

Two strategies, $\gamma$ and $\gamma'$, are called equivalent if $L(\gamma) = L(\gamma')$. A strategy $\gamma$ is
said to dominate a strategy $\gamma'$ if they are not equivalent and $L^i(\gamma) \leq L^i(\gamma')$ for all
$i$ such that $1 \leq i \leq m$. A strategy is called Pareto optimal if it is not dominated
by any others. Since many Pareto-optimal strategies may be equivalent, the task
is to determine one representative from each equivalence class. This will be called
finding the unique Pareto-optimal strategies. For the example in Figure 7.50,
there are two unique Pareto-optimal strategies, which were already given.
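
The definitions of equivalence and dominance translate directly into a filter over cost vectors. The sketch below works on a finite list of candidate cost vectors, which is an assumption (the strategies themselves are not modeled), and the function names are hypothetical.

```python
def dominates(a, b):
    """True if cost vector a dominates b: not equal, and no component is worse."""
    return a != b and all(x <= y for x, y in zip(a, b))

def unique_pareto_optimal(cost_vectors):
    """One representative per equivalence class of nondominated cost vectors."""
    unique = set(map(tuple, cost_vectors))   # merge equivalent strategies
    return sorted(v for v in unique
                  if not any(dominates(w, v) for w in unique))

# The two-robot corridor example with t_off = 2: either robot waits 2 seconds,
# or the second robot waits one second too long.
t_off = 2
candidates = [(0, t_off), (t_off, 0), (0, t_off + 1), (t_off, 0)]
pareto = unique_pareto_optimal(candidates)
```

The vector with the extra second of waiting is discarded because it is dominated, and the duplicate entry collapses into a single representative.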
Scalarization For the motion planning problem, a Pareto-optimal solution is
also optimal for a scalar cost functional that is constructed as a linear combination
of the individual costs. Let $\alpha_1, \ldots, \alpha_m$ be positive real constants. Let

$$
l(\gamma) = \sum_{i=1}^{m} \alpha_i L^i(\gamma). \tag{7.29}
$$

It can be shown that any strategy that is optimal with respect to (7.29) is also a
Pareto-optimal solution [461]. If a Pareto-optimal solution is generated this way,
however, there is no easy way to determine what alternatives exist.

Computing Pareto-optimal strategies Since optimization for one robot is 
already very difficult, it may not be surprising that computing Pareto-optimal 
strategies is even harder. For some problems, it is even possible that a continuum 
of Pareto-optimal solutions exist [], which is very discouraging. Fortunately, for 
the problem of coordinating robots on topological graphs, as considered in Section 
7.2.2, there is only a finite number of solutions. An efficient grid-based algorithm, 
which based on dynamic programming and computes all unique Pareto-optimal 
coordination strategies is presented in [461]. For special cases that involve polyg- 
onal robots moving on a tree of piecewise-linear paths, complete algorithms are 
presented in []. 



m 




(7.29) 



i=i 
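
As a small illustration of the weighted-sum cost in (7.29), the sketch below picks
the minimizer over a finite set of candidate strategies for several choices of
positive weights. The candidates and weights are hypothetical, and the function
names are assumptions made for this example. It also shows the limitation noted
above: each weight vector yields a single strategy and says nothing about the
alternatives.

```python
from typing import Dict, Sequence, Tuple


def scalar_cost(costs: Sequence[float], alphas: Sequence[float]) -> float:
    """Weighted sum of the individual costs; all weights must be positive."""
    if any(a <= 0 for a in alphas):
        raise ValueError("weights must be positive")
    return sum(a * c for a, c in zip(alphas, costs))


def best_by_scalarization(strategies: Dict[str, Tuple[float, ...]],
                          alphas: Sequence[float]) -> str:
    """Name of the strategy that minimizes the weighted sum."""
    return min(strategies, key=lambda name: scalar_cost(strategies[name], alphas))


# Hypothetical cost vectors (L_1, L_2) for two robots.
candidates = {"g1": (3.0, 9.0), "g2": (9.0, 3.0), "g4": (9.0, 9.0), "g5": (5.0, 5.0)}
favor_robot_1 = best_by_scalarization(candidates, (3.0, 1.0))
favor_robot_2 = best_by_scalarization(candidates, (1.0, 3.0))
balanced = best_by_scalarization(candidates, (1.0, 1.0))
print(favor_robot_1, favor_robot_2, balanced)
```

Changing the weights plays the role of changing priorities between robots; the
dominated candidate g4 is never selected for any positive weights.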
Literature

Ref. modular robotics, and also Ghrist's work (esp. Ghrist, Abrams). 
A survey of coverage planning appears in [161] 

Extensions of the spanning tree method, especially for on-line problems: [270]. 
Earlier work on this topic [586]. 

Give more computational biology references. There are many problems that 
are loosely related to motion planning. 

Exercises 

1. To yield polyhedral obstacles for time-varying motion planning, what is the
general form under which the linear geometric primitives $H_i$ that define
$\mathcal{O}$ can be transformed? To yield semi-algebraic models?

2. Give a method for computing the obstacle region for two translating
polygonal robots that follow a linear path.

3. Construct the cube complex for some examples...

4. Try numerical continuation on a surface in $\mathbb{R}^3$.



Chapter 8 



Feedback Motion Strategies 



Chapter Status

What does this mean? Check http://msl.cs.uiuc.edu/planning/status.html for
information on the latest version.



Up to now, it has been assumed that the robot motions are completely pre- 
dictable. In many applications, this assumption is too strong. A collision-free 
path might be insufficient as a representation of a motion strategy. This chapter 
addresses the problem of computing a motion strategy that uses feedback. Dur- 
ing execution, the action taken by the robot will depend only on the measured 
configuration or state. 

8.1 Feedback in Discrete Planning 

8.2 Vector Fields on Manifolds 

8.3 Feedback Strategies in Motion Planning 

If a path is insufficient, what form should a motion strategy take? Suppose that a
world with a robot and obstacles is defined. This leads to the definition of
configuration space, $\mathcal{C}$, and its collision-free subset $\mathcal{C}_{free}$. Suppose
the goal configuration is $q_{goal}$. We might also consider a goal region
$\mathcal{C}_{goal}$.

One possible representation of a feedback motion strategy is a velocity field,
$V$, over $\mathcal{C}$. At any $q \in \mathcal{C}$, a velocity vector $V(q)$ is given, which
indicates how the configuration should change. A successful motion strategy is one
in which the velocity field, when integrated from any initial configuration, leads
to a configuration in $\mathcal{C}_{goal}$.

For problems in Section 15, a feedback motion strategy can take the form of
a function $\mathcal{C} \to U$, in which $U$ is the set of inputs, applied in the state
transition equation, $\dot{x} = f(x,u)$. However, nonholonomic feedback motion
strategies will not be considered in this chapter.

vector field characterization 

action bundle characterization 



8.3.1 Navigation Functions 

potential function, navigation function characterization 

Connect this explanation back to cost-to-go functions from Chapter 2. 

One convenient way to generate a velocity field is through the gradient of a
scalar-valued function. Let $E : \mathcal{C} \to \mathbb{R}$ denote a real-valued,
differentiable potential function. Using $E$, a feedback motion strategy can be
defined as $V = -\nabla E$, in which $\nabla$ denotes the gradient. If designed
appropriately, the potential function can be viewed as a kind of "ski slope" that
guides the robot to a specified goal.

As a simple example, suppose $\mathcal{C} = \mathbb{R}^2$, and that there are no obstacles.
Let $(x,y)$ denote a configuration. Suppose that the goal is $q_{goal} = (0,0)$. A
quadratic function $E(x,y) = x^2 + y^2$ serves as a good potential function to
guide the configuration to the goal. The feedback motion strategy is defined
as $V = -\nabla E = [-2x \;\; -2y]$.

If the goal is at any $(x_0, y_0)$, then a potential function that guides the
configuration to the goal is $E(x,y) = (x - x_0)^2 + (y - y_0)^2$.
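
The quadratic potential can be tried numerically. The following is a minimal
sketch, assuming forward Euler integration with a hand-picked step size `dt`; the
function names, start configuration, and goal are hypothetical choices made for
the example.

```python
from typing import Tuple

Point = Tuple[float, float]


def velocity(q: Point, goal: Point) -> Point:
    """V(q) = -grad E(q) for E(x, y) = (x - x0)^2 + (y - y0)^2."""
    return (-2.0 * (q[0] - goal[0]), -2.0 * (q[1] - goal[1]))


def follow_field(q: Point, goal: Point, dt: float = 0.1, steps: int = 200) -> Point:
    """Integrate the velocity field with forward Euler steps of size dt."""
    for _ in range(steps):
        vx, vy = velocity(q, goal)
        q = (q[0] + dt * vx, q[1] + dt * vy)
    return q


final = follow_field(q=(3.0, -4.0), goal=(1.0, 2.0))
print(final)
```

With `dt = 0.1`, each step shrinks the offset from the goal by a factor of 0.8, so
the configuration converges to the goal from any start. Because the velocity
depends only on the current configuration, the same field works without replanning
if the robot is pushed off course.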

Suppose the configuration space contains point obstacles. The previous function
$E$ can be considered as an attractive potential because the configuration is
attracted to the goal. One can also construct a repulsive potential that repels the
configuration from the obstacles to avoid collision. If $E_a$ denotes the attractive
component, and $E_r$ denotes the repulsive potential, then a potential function of
the form $E = E_a + E_r$ can be defined to combine both effects. The robot should
be guided to the goal while avoiding obstacles. The problem is that there is no
way in general to ensure that the potential function will not contain multiple
local minima. The configuration could become trapped at a local minimum that is
not in the goal region.

Rimon and Koditschek [659] presented a method for designing potential functions
that contain only one minimum, which is precisely at the goal. These special
potential functions are called navigation functions. Unfortunately, the technique
applies only when $\mathcal{C}_{free}$ is of a special form. In general, there are no
known ways to efficiently compute a potential function that contains only one
local minimum, which is at the goal. This is not surprising given the difficulty of
the basic path planning problem.

cost-to-go functions 
8.4 Combinatorial Algorithms

8.4.1 Harmonic Functions 

8.4.2 Feedback Strategies over Complexes 

Cell based conversions 

Star-shaped regions: Rimon and Koditschek 

8.5 Sampling-Based Algorithms 

8.5.1 Sampling-Based Neighborhood Graph 

This is pasted from a paper of ours; it definitely needs to be substantially shortened 
and written from a more general perspective 

Sampling-based techniques can be used to compute a navigation function (a
potential function with one local minimum, which is at the goal) over most of
$\mathcal{C}_{free}$. One method presented here is called the Sampling-Based
Neighborhood Graph (SNG).

A Sampling-Based Neighborhood Graph (SNG) is an undirected graph, $G = (V,E)$,
in which $V$ is the set of vertices and $E$ is the set of edges. Each vertex
represents an $n$-dimensional neighborhood that lies entirely in $\mathcal{C}_{free}$.
In this paper, an $n$-dimensional ball is used. For any vertex, $v$, let $c_v$
denote the center of its corresponding ball, $r_v$ denote the radius of its ball,
and let $B_v$ be the set of points, $B_v = \{q \in \mathcal{C} \mid \|q - c_v\| < r_v\}$.
We require that $B_v \subseteq \mathcal{C}_{free}$. The definition of $B_v$ assumes that
$\mathcal{C}$ is an $n$-dimensional Euclidean space; however, minor modifications can
be made to include other frequently occurring topologies, such as
$\mathbb{R}^2 \times S^1$ and $\mathbb{R}^3 \times P^3$.

An edge, $e \in E$, exists for each pair of vertices, $v_i$ and $v_j$, if and only
if their balls intersect, $B_i \cap B_j \neq \emptyset$. Assume that no ball is contained
within another ball, $B_i \not\subseteq B_j$, for all distinct $v_i$ and $v_j$ in $V$.
Let $B$ represent the subset of $\mathcal{C}_{free}$ that is occupied by balls,
$B = \bigcup_{v \in V} B_v$. Suppose that the graph $G$ has been given;
an algorithm that constructs $G$ is presented in Section 8.5.1. For a given goal,
the SNG will be used to represent a feedback strategy, which can be encoded as
a real-valued navigation function, $\gamma : B \to \mathbb{R}$. This function will have
only one minimum, which is at the goal configuration. If the goal changes, it will
also be possible to quickly "reconfigure" the SNG to obtain a new function,
$\gamma'$, which has its unique minimum at the new goal.

Let $G$ be a weighted graph in which $l(e)$ denotes the cost assigned to an edge
$e \in E$. Assume that $1 < l(e) < \infty$ for all $e \in E$. The particular assignment
of costs can be used to induce certain preferences on the type of solution (e.g.,
maximize clearance, minimize distance traveled). Let $B_{v_g}$ denote any ball that
contains the goal, $q_{goal}$, and let $v_g$ be its corresponding vertex in $G$. Let
$L^*(v)$ denote the optimal cost in $G$ to reach $v_g$ from $v$. The optimal costs
can be recomputed for each vertex in $V$ in $O(V^2)$ or $O(V \lg V + E)$ time using
Dijkstra's algorithm; alternatively, an all-pairs shortest paths algorithm can be
used to implicitly define solutions for all goals in advance.

Figure 8.1: The negative gradient of a partial navigation function sends the robot
to a lower-cost ball.

Assume that $G$ is connected; if $G$ is not connected, then the following
discussion can be adapted to the connected component that contains $v_g$. Define a
strict linear ordering, $<_v$, over the set of vertices in $V$ using $L^*(v)$ as
follows. If $L^*(v_1) < L^*(v_2)$ for any $v_1, v_2 \in V$, then $v_1 <_v v_2$. If
$L^*(v_1) = L^*(v_2)$, then the ordering of $v_1$ and $v_2$ can be defined in an
arbitrary way, while ensuring that $<_v$ remains a linear ordering. The ordering
$<_v$ can be adapted directly to the set of corresponding balls to obtain an
ordering $<_b$ such that $B_{v_1} <_b B_{v_2}$ if and only if $v_1 <_v v_2$. Note
that the smallest element with respect to $<_b$ always contains the goal.

For a given goal, the SNG will be used to represent a mapping $\gamma : B \to \mathbb{R}$
that serves as a global potential or navigation function. For each vertex,
$v \in V$, let $\gamma_v : B_v \to \mathbb{R}$ represent a partial strategy. Among all
balls that intersect $B_v$, let $B_{v_m}$ denote the ball that is minimal with
respect to $<_b$. It is assumed that $\gamma_v$ is a differentiable function that
attains a unique minimum at a point in the interior of $B_v \cap B_{v_m}$.
Intuitively, each partial strategy guides the robot to a ball that has lower cost.

The partial strategies are combined to yield a global strategy in the following
way. Any configuration, $q \in B$, will generally be contained in multiple balls.
Among these balls, let $B_v$ be the minimal ball with respect to $<_b$ that
contains $q$. The navigation function at $q$ is given by $\gamma_v(q)$, thus
resolving any ambiguity. Note that the robot will typically not reach the minimum
of a partial strategy before "jumping" to a ball that is lower with respect to
$<_b$.
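
The cost computation and the ball ordering can be sketched as follows. This is an
illustrative sketch only: the adjacency-dictionary graph format, the tie-breaking
by vertex index, and the three-ball example are assumptions made here, and the
partial strategies themselves are not modeled.

```python
import heapq
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Ball = Tuple[Point, float]  # (center, radius) in a 2D configuration space


def cost_to_goal(adj: Dict[int, Dict[int, float]], goal_vertex: int) -> Dict[int, float]:
    """Dijkstra's algorithm from the goal vertex; returns L*(v) for reachable vertices."""
    dist = {goal_vertex: 0.0}
    heap = [(0.0, goal_vertex)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, math.inf):
            continue  # stale queue entry
        for w, cost in adj[v].items():
            nd = d + cost
            if nd < dist.get(w, math.inf):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist


def ball_order(dist: Dict[int, float]) -> List[int]:
    """Vertices sorted by L*(v); ties are broken by vertex index (an arbitrary choice)."""
    return sorted(dist, key=lambda v: (dist[v], v))


def active_ball(q: Point, balls: Dict[int, Ball], order: List[int]) -> int:
    """Minimal ball in the ordering that contains q; its partial strategy applies at q."""
    for v in order:
        center, radius = balls[v]
        if math.dist(q, center) < radius:
            return v
    raise ValueError("q is not covered by the SNG")


# Hypothetical chain of three overlapping balls; the goal lies in ball 0.
balls = {0: ((0.0, 0.0), 1.0), 1: ((1.5, 0.0), 1.0), 2: ((3.0, 0.0), 1.0)}
adj = {0: {1: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {1: 1.0}}
L_star = cost_to_goal(adj, goal_vertex=0)
order = ball_order(L_star)
print(order, active_ball((2.2, 0.0), balls, order))
```

The configuration (2.2, 0) lies in balls 1 and 2; the lookup selects ball 1
because it is lower in the ordering. Changing the goal only requires rerunning
`cost_to_goal` from a different vertex.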

SNG Construction Algorithm An outline of the SNG construction algorithm
follows:

GENERATE_SNG(α, P_c)
 1  G.init(q_init);
 2  while (TerminationUnsatisfied(G, α, P_c)) do
 3      repeat
 4          q_new ← (random configuration);
 5          d ← DistanceComputation(q_new);
 6      until ((d > 0) and (q_new ∉ B))
 7      r ← ComputeRadius(d);
 8      v_new ← G.AddVertex(q_new, r);
 9      G.AddEdges(v_new);
10  G.DeleteEnclaves();
11  G.DeleteSingletons();
12  Return G

The inputs are $\alpha \in (0, 1)$ and $P_c \in (0, 1)$ (the obstacle and robot models
are implicitly assumed). For a given $\alpha$ and $P_c$, the algorithm will construct
an SNG such that with probability $P_c$, the ratio of the volume of $B$ to the
volume of $\mathcal{C}_{free}$ is at least $\alpha$.

Each execution of Lines 3-9 corresponds to the addition of a new ball,
$B_{v_{new}}$, to the SNG. This results in a new vertex in $G$, and new edges that
each correspond to another ball that intersects $B_{v_{new}}$. Balls are added to
the SNG until the Bayesian termination condition is met, causing
TerminationUnsatisfied to return FALSE. The Bayesian method used in the
termination condition is presented in Section 8.5.1. The repeat loop from Lines 3
to 6 generates a new sample in $\mathcal{C}_{free} \setminus B$, which might require
multiple iterations. Collision detection and distance computation are performed in
Line 5. Many algorithms exist that either exactly compute or compute a lower bound
on the closest distance in $\mathcal{W}$ between $\mathcal{A}$ and $\mathcal{O}$
[492, 556, 639],
$d(q_{new}) = \min_{a \in \mathcal{A}(q_{new})} \min_{o \in \mathcal{O}} \|a - o\|$.
If $d$ is not positive, then $q_{new}$ is in collision, and another configuration
is chosen. The new configuration must also not be already covered by the SNG
before the repeat loop terminates. This forces the SNG to quickly expand into
$\mathcal{C}_{free}$, and leads to few edges per vertex in $G$.

Distance computation algorithms are very efficient in practice, and their
existence is essential to our approach. The distance, $d$, is used in Line 7 by
the ComputeRadius function, which attempts to select $r$ to create the largest
possible ball that is centered at $q_{new}$ and lies entirely in
$\mathcal{C}_{free}$. A general technique for choosing $r$ is presented in Section
8.5.1.

The number of iterations in the while loop depends on the Bayesian termination
condition, which in turn depends on the outcome of sampled events during
execution and the particular $\mathcal{C}_{free}$ for a given problem. The two
largest computational expenses arise from the distance computation and the test of
whether $q_{new}$ lies in $B$. Efficient algorithms exist for both of these
problems.
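
The construction loop can be sketched for the simplest setting. The code below is
an assumption-laden illustration, not the implementation from the text: it assumes
a point robot in the unit square with disc obstacles (so the radius is simply
$r = d$), takes the number $k$ of consecutive covered samples as a direct input
instead of deriving it from $\alpha$ and $P_c$, adds a hypothetical `max_samples`
cap, and omits the enclave and singleton deletion steps.

```python
import math
import random
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]
Disc = Tuple[Point, float]  # (center, radius)


def clearance(q: Point, obstacles: List[Disc]) -> float:
    """Distance from q to the nearest disc obstacle or wall of the unit square."""
    d_wall = min(q[0], q[1], 1.0 - q[0], 1.0 - q[1])
    d_obs = min((math.dist(q, c) - r for c, r in obstacles), default=math.inf)
    return min(d_wall, d_obs)


def generate_sng(obstacles: List[Disc], k: int, rng: random.Random,
                 max_samples: int = 20_000):
    """Add balls until k consecutive free samples already lie in B (or the cap is hit)."""
    balls: List[Disc] = []
    edges: Dict[int, Set[int]] = {}
    hits = 0
    for _ in range(max_samples):
        if hits >= k:
            break
        q = (rng.random(), rng.random())
        d = clearance(q, obstacles)
        if d <= 0.0:
            continue  # in collision: ignored by the termination counter
        if any(math.dist(q, c) < r for c, r in balls):
            hits += 1  # already covered by the SNG
            continue
        hits = 0
        v_new = len(balls)
        balls.append((q, d))  # point robot: the largest free ball has radius d
        edges[v_new] = set()
        for v, (c, r) in enumerate(balls[:-1]):
            if math.dist(q, c) < r + d:  # the two balls intersect
                edges[v_new].add(v)
                edges[v].add(v_new)
    return balls, edges


obstacles = [((0.5, 0.5), 0.2)]
balls, edges = generate_sng(obstacles, k=30, rng=random.Random(0))
print(len(balls), sum(len(nbrs) for nbrs in edges.values()) // 2)
```

The linear scan over all balls in the coverage test is used only for brevity; a
spatial index would be the natural replacement for larger graphs.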

Radius selection For a given $q_{new}$, the task is to select the largest radius,
$r$, such that the ball $B_v = \{q \in \mathcal{C} \mid \|q_{new} - q\| < r\}$ is a
subset of $\mathcal{C}_{free}$. If DistanceComputation($q_{new}$) returns $d$, then
$\max_{a \in \mathcal{A}} \|a(q_{new}) - a(q)\| < d$ for all $q \in B_v$ implies that
$B_v \subseteq \mathcal{C}_{free}$. For many robots one can determine a point, $a_f$,
in $\mathcal{A}$ that moves the furthest as the configuration varies. For a rigid
robot, this is the point that would have the largest radius if polar or spherical
coordinates are used to represent $\mathcal{A}$. The goal is to make $r$ as large as
possible to make the SNG construction algorithm more efficient. The largest value
of $r$ is greatly affected by the parameterization of the kinematics. For example,
if $a_f$ is far from the origin, points on the robot will move very quickly as the
rotation angle changes.

Although many alternatives are possible, one general methodology for selecting
$r$ for various robots and configuration spaces is to design a parameterization by
bounding the arc length. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ denote the expression of
the kinematics that maps points from an $n$-dimensional configuration space to an
$m$-dimensional world. In general, arc length in the world, based on differential
changes in configuration, is specified by a metric tensor. If the transformation
$f$ is orthogonal, the arc length is

$$ds^2 = \sum_{i=1}^{n} \left\| \frac{\partial f}{\partial q_i} \right\|^2 dq_i^2, \tag{8.1}$$

in which each term represents the squared magnitude of a column in the Jacobian
of $f$. Using the bound $\sqrt{ds^2} < d$, (8.1) expresses the equation of a solid
ellipsoid in the configuration space. That solid ellipsoid will differ
significantly across different kinematic expressions. The key is to choose
kinematic expressions that keep the ellipsoid as close as possible to a sphere.

For a 2D rigid robot with translation and rotation,
$\mathcal{C} = \mathbb{R}^2 \times S^1$, let $r_m = \|a_f(0)\|$. If the standard
parameterization of rotation were used, the effects of rotation would dominate,
resulting in a smaller radius, $r = d/r_m$. But if a scaled rotation,
$q_3 = r_m \theta$, is used, (8.1) will yield $r = d$, which is a sphere. Although
the relative fraction of $S^1$ that is covered is the same in either case, the
amount of $\mathbb{R}^2$ that is covered is increased substantially. For a 3D rigid
robot with translation and rotation, $\mathcal{C} = \mathbb{R}^3 \times P^3$, the same
result can be obtained if roll, pitch, and yaw are used to represent rotation.
Quaternions are not used because (8.1) will not yield a simple expression. For
problems that involve articulated bodies, it is preferable to derive expressions
that consider the distance in the world of each rigid body.

A Bayesian termination condition The above algorithm decides to terminate
based on a statistical estimate of the fraction of $\mathcal{C}_{free}$ that is
covered by the SNG. The volumes of $\mathcal{C}_{free}$ and $B$, denoted by
$\mu(\mathcal{C}_{free})$ and $\mu(B)$, are assumed unknown. Although it is
theoretically possible to incrementally compute $\mu(B)$, it is generally too
complicated. A Bayesian termination condition can be derived based on the number
of samples that fall into $B$, as opposed to $\mathcal{C}_{free} \setminus B$. For a
given $\alpha$ and $P_c$, the algorithm will terminate when $100\alpha$ percent of the
volume of $\mathcal{C}_{free}$ has been covered by the SNG with probability $P_c$.

Let $p(x)$ represent a probability density function that corresponds to the
fraction $x = \mu(B)/\mu(\mathcal{C}_{free})$. Let $y_1, y_2, \ldots, y_k$ represent a
series of $k$ observations, each of which corresponds to a random configuration
drawn from $\mathcal{C}_{free}$. Each observation has two possible values: either
the random configuration, $q_{new}$, is in $B$ or in
$\mathcal{C}_{free} \setminus B$. Let $y_k = 1$ denote $q_{new} \in B$, and let
$y_k = 0$ denote $q_{new} \in \mathcal{C}_{free} \setminus B$.

For a given $\alpha$ and $P_c$, we would like to determine whether
$P[x > \alpha] > P_c$. Assume that the prior $p(x)$ is a uniform density over
$[0, 1]$. By iteratively applying Bayes' rule, for a chain of $k$ successive
samples that all lie in $B$, the posterior density is $(k+1)x^k$, and therefore
$P[x > \alpha] = 1 - \alpha^{k+1}$.

The algorithm terminates when the number of successive samples that lie in $B$
is $k$, such that $\alpha^{k+1} < 1 - P_c$. One can solve for $k$, and the algorithm
will terminate when $k = \frac{\ln(1 - P_c)}{\ln \alpha} - 1$. During execution, a
simple counter records the number of consecutive samples that fall into $B$
(ignoring samples that fall outside of $\mathcal{C}_{free}$).
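
The termination count can be computed directly from the inequality
$\alpha^{k+1} < 1 - P_c$. The sketch below is a minimal illustration under the
uniform-prior assumption stated above; the function names are hypothetical, and
the result is rounded up to an integer because $k$ counts samples.

```python
import math


def required_successes(alpha: float, p_c: float) -> int:
    """Smallest integer k with alpha**(k + 1) < 1 - p_c, i.e. P[x > alpha] > p_c."""
    if not (0.0 < alpha < 1.0 and 0.0 < p_c < 1.0):
        raise ValueError("alpha and p_c must lie in (0, 1)")
    k = max(0, math.ceil(math.log(1.0 - p_c) / math.log(alpha) - 1.0))
    # Adjust for floating-point rounding near the boundary, in both directions.
    while k > 0 and alpha ** k < 1.0 - p_c:
        k -= 1
    while alpha ** (k + 1) >= 1.0 - p_c:
        k += 1
    return k


def coverage_confidence(alpha: float, k: int) -> float:
    """P[x > alpha] after k consecutive samples in B, under a uniform prior on x."""
    return 1.0 - alpha ** (k + 1)


k = required_successes(alpha=0.99, p_c=0.99)
print(k, coverage_confidence(0.99, k))
```

For 99 percent coverage with 99 percent confidence this gives $k = 458$
consecutive covered samples, which is the same order of magnitude as the
$k = 500$ used in the computed examples.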

Some Computed Examples Figure 8.2.a shows the balls of the SNG for a
point robot in a 2D environment. Figure 8.2.b shows the SNG edges as line
segments between ball centers. The SNG construction required 23s, and the
algorithm terminated after 500 successive failures ($k = 500$) to place a new ball.
The SNG contained 535 nodes, 525 of which are in a single connected component.
There were 1854 edges, resulting in an average of only 3.46 edges per vertex. We
have observed that this number remains low, even for higher-dimensional problems.
This is an important feature for maintaining efficiency because of the graph
search operations that are needed to build navigation functions.

Figures 8.2.c and 8.2.d show level sets of two different potential functions that
were quickly computed for two different goals (each in less than 10ms). The first
goal is in the largest ball, and the second goal is in the upper right corner. Each
ball will guide the robot into another ball, which is one step closer to the goal.
Using this representation, the particular path taken by the robot during execution
is not critical. For higher-dimensional configuration spaces, we only show robot
trajectories, even though much more information is contained in the SNG.

8.6 Computing Optimal Feedback Strategies 

Numerical dynamic programming: refer back to last section of Chapter 7, and 
Section 2.4. 

Give standard iterative technique 

Then give Dijkstra-like algorithm 

Maybe also give wavefront propagation (Dial's alg) 




Figure 8.2: (a) The computed neighborhoods for a 2D configuration space; (b)
the corresponding graph superimposed on the neighborhoods; (c), (d) the level sets
of two navigation functions computed from a single SNG.
Overview of Part III: Decision-Theoretic Planning

Planning under Uncertainty 

Just as in Part II, it also seems appropriate to give two names to Part III: 1) 
decision-theoretic planning, and 2) planning under uncertainty, explain... 
refer back to computation models of Chapter 1 

Chapter 9 addresses the problem of how to model and solve the problem of 
making a single decision while facing uncertainties. No state space is necessary in 
this case because the "plan" in this case has only one step. One purpose of the 
chapter is to introduce and carefully evaluate the assumptions that are typically 
made in different forms of decision theory. This forms the basis of more com- 
plicated problems that follow, especially sequential decision making and control 
theory. 

Chapter 10 extends the tools from Chapter 9 from a single decision to multiple
decisions. In this case, a state space is needed once again, and the problems can
be considered as generalizations of the discrete planning problems of Chapter
2. It is assumed that the state can always be perfectly sensed; however, there are
uncertainties about what future states will occur.

Chapter 11 introduces perhaps the most important concept of this book: the 
information space. If there is uncertainty in sensing the current state, then the 
planning problem naturally lives in the information space. An analogy can be 
made to the configuration space and motion planning. Before efforts to unify mo- 
tion planning by using configuration space concepts [437, 504, ?], most algorithms 
were developed on a case by case basis, especially for robot manipulators and 
mobile robots, which appear to have very different characteristics in the world. 
However, once viewed in the configuration space, it is possible to construct general 
algorithms, such as those from Chapters 6 and 5. 

A similar kind of unification should be possible for planning problems that
involve sensing uncertainties (i.e., the current state cannot be determined).
Presently, the literature appears to be mostly on a case by case basis, as basic
motion planning once was. Therefore, it is difficult to provide a perspective as
unified as the techniques in Part I. Nevertheless, the concepts from Chapter 11
are used to provide a unified introduction to many planning problems that involve
sensing uncertainties in Chapter ??. Just as in the case of configuration space,
some effort is required to learn the information space concepts, but it will pay
great dividends if the investment is made. Honestly!
Basic Decision Theory






These are class notes from CS497 Spring 2003, parts of which were scribed by 
Steve Lindemann, Shai Sachs. 

9.1 Basic Definitions 

To introduce some of the basic concepts in single-stage decision making, consider 
the following scenario: 

Scenario 0
1. Let $U$ be a set of possible choices: $\{u_1, u_2, \ldots, u_n\}$.

2. Let $L : U \to \mathbb{R}$ be a loss function or cost function.

3. Select a $u \in U$ that minimizes $L(u)$.

In this scenario, we see that the set $U$ consists of all choices that we can make;
these are also called actions or inputs. The loss function $L$ represents the cost
associated with each possible choice; another approach is to define a reward
function $R$ which represents the gain or benefit of each choice. These approaches
are equivalent, since one can simply take $R(u) = -L(u)$.

A method used to make a decision is called a strategy. In this scenario, our
strategy was deterministic; that is, given some set $U$ and function $L$, our
choice is completely determined. Alternatively, we could have taken a randomized
strategy, in which our decision also depended on the outcome of some random
events. In this strategy, we define a function $p : U \to \mathbb{R}$ such that the
probability of selecting a particular choice $u$ is $p(u)$; denote
$p(u_i) = p_i$. The ordinary rules governing probability spaces apply (e.g.,
$\sum_{i=1}^{n} p_i = 1$, $p_i \geq 0$). Randomized and deterministic strategies are
also called mixed and pure, respectively. For purposes of notation, we will use
$u^*$ to refer to a randomized strategy and $\mathcal{U}$ to refer to the set of all
randomized strategies.

Example 9.1.1 Let the input set $U = \{a, b\}$. Then one can choose a randomized
strategy $u^*$ in the following way:

1. Flip a fair H/T coin.

2. If the result is H, choose $a$; if T, choose $b$.

Since the coin is fair, this corresponds to choosing $p(a) = 0.5$, $p(b) = 0.5$.

Consider the following scenario:

Scenario 1
1. $U = \{u_1, u_2, \ldots, u_n\}$

2. $L : U \to \mathbb{R}$

3. Select $u^* \in \mathcal{U}$ that minimizes $E[L] = \sum_{i=1}^{n} L(u_i) p_i$.

$E[L]$ reflects the average loss if the game were to be played many times. Now,
Scenarios 0 and 1 are identical, with the exception that one uses a deterministic
strategy, and one uses a randomized strategy. Which is better? To help answer
this, we give the following example:

Example 9.1.2 Let $U = \{1, 2, 3\}$, and $L(1) = 2$, $L(2) = 3$, $L(3) = 5$ (we may
write this in vector notation as $L = [2 \;\; 3 \;\; 5]$). Following the deterministic
strategy from Scenario 0, we choose $u = 1$. What if we use the strategy from
Scenario 1? By inspection we can see that we need $p = [1 \;\; 0 \;\; 0]$; thus, the
randomized strategy results in the same choice as the deterministic one.
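
The comparison in Example 9.1.2 can be checked numerically. The following is a
small sketch; the dictionary representation of $L$ and $p$ and the function names
are assumptions made for the illustration.

```python
from typing import Dict


def best_pure_choice(loss: Dict[int, float]) -> int:
    """Deterministic strategy: pick the choice u with minimum loss L(u)."""
    return min(loss, key=loss.get)


def expected_loss(loss: Dict[int, float], p: Dict[int, float]) -> float:
    """Randomized strategy: E[L] = sum_i L(u_i) p_i for a distribution p over U."""
    if abs(sum(p.values()) - 1.0) > 1e-12 or any(v < 0 for v in p.values()):
        raise ValueError("p must be a probability distribution over U")
    return sum(loss[u] * p.get(u, 0.0) for u in loss)


loss = {1: 2.0, 2: 3.0, 3: 5.0}  # the loss values of Example 9.1.2
pure = best_pure_choice(loss)
degenerate = expected_loss(loss, {1: 1.0, 2: 0.0, 3: 0.0})
uniform = expected_loss(loss, {1: 1 / 3, 2: 1 / 3, 3: 1 / 3})
print(pure, degenerate, uniform)
```

Since $E[L]$ is a weighted average of the losses, no distribution can achieve less
than the smallest loss, and the degenerate distribution $p = [1 \;\; 0 \;\; 0]$
attains it.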

We have seen in the above example that randomized strategies and deterministic
ones can produce identical results. However, what if for some input set $U$ and
loss function $L$, we have $L(u_i) = L(u_j)$? Then, there can be randomized
strategies that act differently than deterministic ones. However, if one only
considers the minimum loss attained, they are not better because both types of
strategies will select actions resulting in minimum loss. Thus, in this case we
find that Scenario 1 is useless! However, randomized strategies are very useful in
general, as shown in the following example.

Example 9.1.3 (Matching Pennies) Consider a game in which two players
simultaneously choose H or T. If the outcome is HH or TT (the players choose the
same), then Player 1 pays Player 2 \$1; if the outcome is HT or TH, then Player
2 pays Player 1 \$1. What happens if Player 1 uses a deterministic strategy? If
Player 2 can determine what that strategy is, then he can choose his strategy so
that he always wins the game. However, if Player 1 chooses a randomized strategy,
he can at least expect to break even (what randomized strategy guarantees this?).
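
Candidate strategies for Player 1 can be evaluated with a short sketch. It assumes
that each player chooses H independently with a fixed probability and that Player
2 knows Player 1's probability; the function names are hypothetical.

```python
def expected_gain_player1(p_heads_1: float, p_heads_2: float) -> float:
    """Expected dollars won by Player 1, who gains 1 on a mismatch and loses 1 on a match."""
    p_match = p_heads_1 * p_heads_2 + (1.0 - p_heads_1) * (1.0 - p_heads_2)
    return (1.0 - p_match) - p_match


def worst_case(p_heads_1: float) -> float:
    """Player 1's guaranteed expectation against the best reply of Player 2.

    The gain is linear in Player 2's probability, so checking the two pure
    replies (always T, always H) is sufficient.
    """
    return min(expected_gain_player1(p_heads_1, q) for q in (0.0, 1.0))


for p in (1.0, 0.7, 0.5):
    print(p, worst_case(p))
```

Any deterministic or biased choice gives Player 2 a reply with negative
expectation for Player 1, whereas the fair coin yields an expectation of zero
against every reply.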
So far, we have examined scenarios in which there were only a finite number of
possible choices. Many problems, however, have a continuum of choices, as does
the following:

Scenario 2
1. $U \subseteq \mathbb{R}^d$ (usually, $U$ is closed and bounded)

2. $L : U \to \mathbb{R}$

3. Select $u \in U$ to minimize $L$

This is a classical optimization problem.

Example 9.1.4 Let $U = [-1, 1] \subset \mathbb{R}$ and $L(u) = u^2$. To attain minimum
cost we choose $u = 0$.

However, what if in the example above we chose $U = (0, 1)$? Then the minimum
is not attained by any $u \in U$. However, we can introduce the concept of the
infimum, which is the greatest lower bound of a set. Similarly, we can introduce
the supremum, which is the least upper bound of a set. Then, we can still say
$\inf_{u \in U} L(u) = 0$.



9.2 A Game Against Nature 

In the previous scenarios, we have assumed complete knowledge about the loss
function $L$. This need not be the case, however; in particular situations, there
may be uncertainty involved. One convenient way to describe this uncertainty
is to introduce a special decision-maker, called nature. Nature is an unreasoning
entity (i.e., it is not an adversary), and we do not know what decision nature will
make (or has made). We call $\Theta$ the set of choices for nature (alternatively,
the parameter space), and $\theta \in \Theta$ is a particular choice by nature. The
parameter space may be either discrete or continuous; in the discrete case, we
have $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$, and in the continuous case we have
$\Theta \subseteq \mathbb{R}^d$. Then, we can define the loss function to be
$L : U \times \Theta \to \mathbb{R}$, in which $\times$ denotes the Cartesian product.

Example 9.2.1 Let $L$ be specified by the following table:

                      U
               u_1   u_2   u_3
      θ_1       1    -1     2
  Θ   θ_2      -1     2    -1
      θ_3       0    -2     1


The best strategy to adopt depends on what model we have of what nature will 
do: 

• Nondeterministic: I have no idea. 

• Probabilistic: I have been observing nature and gathering statistics. 
In the first case, one might assume Murphy's Law ("If anything can go wrong, it
will"); then, one would choose the column with the least maximum value.
Alternatively, one might assume that nature's decisions follow a uniform
distribution, all choices being equally likely. Then one would choose the column
with the least average loss (this approach was taken by Laplace in 1812). In the
second case, one could use Bayesian analysis to calculate a probability
distribution $P(\theta)$ of the actions of nature, and use that to make decisions.
The following two scenarios formalize these approaches.

Scenario 3 (Minimax solution)
1. $U = \{u_1, \ldots, u_n\}$

2. $\Theta = \{\theta_1, \ldots, \theta_m\}$

3. $L : U \times \Theta \to \mathbb{R}$

4. Choose $u$ to minimize $\max_{\theta \in \Theta} L(u, \theta)$.

Scenario 4 (Expected optimal solution)
1. $U = \{u_1, \ldots, u_n\}$

2. $\Theta = \{\theta_1, \ldots, \theta_m\}$

3. $P(\theta)$ given $\forall \theta \in \Theta$

4. $L : U \times \Theta \to \mathbb{R}$

5. Choose $u$ to minimize $E_\theta[L] = \sum_{\theta \in \Theta} L(u, \theta) P(\theta)$.

Again consider Example 9.2.1. If the strategy from Scenario 3 is adopted, then
we would choose $u_1$ so that we would pay loss 1 in the worst case. If the
strategy from Scenario 4 is chosen, and assuming $P(\theta_1) = 1/5$,
$P(\theta_2) = 1/5$, $P(\theta_3) = 3/5$, we find that $u_2$ has the lowest expected
loss, and so would take that action. If the probability distribution had been
$P = [1/10 \;\; 4/5 \;\; 1/10]$, then simple calculations show that $u_1$ is the best
choice. Hence our decision depends on $P(\theta)$; if this information is
statistically valid, then better decisions are made. If it is not, then
potentially worse decisions can be made.

Another strategy is to minimize "regret", the amount of loss you could have
eliminated if you had chosen differently, given the action of nature. A regret
matrix corresponding to Example 9.2.1 can be found in Figure 9.1. Given some
regret matrix, one can adopt a minimax or expected optimal strategy.
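
The minimax, expected optimal, and regret computations can be sketched on the loss
table of Example 9.2.1. The table entries used below are an assumption: they are
read from the extracted text with rows as nature's choices and columns as our
choices, and the entry for $(u_1, \theta_3)$ is taken to be 0. Under this reading
the code reproduces the choices stated above. The function names are hypothetical.

```python
from typing import List, Sequence

# Assumed loss table: rows are theta_1..theta_3, columns are u_1..u_3.
# The 0 entry in the last row is an assumption, not confirmed by the source.
LOSS: List[List[float]] = [
    [1.0, -1.0, 2.0],
    [-1.0, 2.0, -1.0],
    [0.0, -2.0, 1.0],
]


def column(table: List[List[float]], j: int) -> List[float]:
    return [row[j] for row in table]


def minimax_choice(table: List[List[float]]) -> int:
    """Index of the column with the least maximum loss (Scenario 3)."""
    return min(range(len(table[0])), key=lambda j: max(column(table, j)))


def expected_choice(table: List[List[float]], prior: Sequence[float]) -> int:
    """Index of the column with the least expected loss under P(theta) (Scenario 4)."""
    def risk(j: int) -> float:
        return sum(p * l for p, l in zip(prior, column(table, j)))
    return min(range(len(table[0])), key=risk)


def regret(table: List[List[float]]) -> List[List[float]]:
    """Loss minus the best loss attainable against the same choice of nature."""
    return [[l - min(row) for l in row] for row in table]


print(minimax_choice(LOSS))                     # index 0 means u_1
print(expected_choice(LOSS, [0.2, 0.2, 0.6]))   # index 1 means u_2
print(expected_choice(LOSS, [0.1, 0.8, 0.1]))
print(regret(LOSS), minimax_choice(regret(LOSS)))
```

Because `regret` returns a table of the same shape, the same two selection
functions can be applied to it directly, as the last line does.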

9.2.1 Having a single observation 

Let $y$ be an observation; this could be some data, a measurement, or a sensor
reading. Let $Y$ be the observation space, the set of all possible $y$. Now, we can
make a decision based on $y$; let $\gamma : Y \to U$ denote a decision rule
(strategy, plan). Then modify our decision strategies as follows:

• Nondeterministic: Assume there is some $F(y) \subseteq \Theta$, which is known for
every $y \in Y$. Choose some $\gamma$ that minimizes
$\max_{\theta \in F(y)} L(\gamma(y), \theta)$ for each $y \in Y$.

• Probabilistic: Assume that $P(y|\theta)$ is known, $\forall y \in Y$,
$\forall \theta \in \Theta$. Then Bayes' rule yields
$P(\theta|y) = P(y|\theta)P(\theta)/P(y)$, in which
$P(y) = \sum_{\theta \in \Theta} P(y|\theta)P(\theta)$.$^1$ Then choose $\gamma$ so that it
minimizes the conditional Bayes risk
$R(u|y) = \sum_{\theta \in \Theta} L(u, \theta) P(\theta|y)$, for every $y \in Y$.

Figure 9.1: A regret matrix corresponding to Example 9.2.1.

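Formally, we have the following scenarios:

Scenario 5 (Nondeterministic)
1. $U = \{u_1, \ldots, u_n\}$

2. $\Theta = \{\theta_1, \ldots, \theta_m\}$

3. $Y = \{y_1, \ldots, y_l\}$

4. $F(y)$ given $\forall y \in Y$

5. $L : U \times \Theta \to \mathbb{R}$

6. Choose $\gamma$ to minimize $\max_{\theta \in F(y)} L(\gamma(y), \theta)$ for each
$y \in Y$.

Scenario 6 (Bayesian decision theory)
1. $U = \{u_1, \ldots, u_n\}$

2. $\Theta = \{\theta_1, \ldots, \theta_m\}$

3. $Y = \{y_1, \ldots, y_l\}$

4. $P(\theta)$ given $\forall \theta \in \Theta$

5. $P(y|\theta)$ given $\forall y \in Y, \theta \in \Theta$

6. $L : U \times \Theta \to \mathbb{R}$

7. Choose $\gamma$ to minimize $R(\gamma(y)|y)$ for every $y \in Y$.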

Extending the former case, we may imagine that we have $k$ observations:
$y_1, \ldots, y_k$. Then,
$R(u|y_1, \ldots, y_k) = \sum_{\theta \in \Theta} L(u, \theta) P(\theta|y_1, \ldots, y_k)$.
If we assume that $P(y_i|\theta)$ is known for each $i \in \{1, \ldots, k\}$ and that
conditional independence holds, we have
$P(\theta|y_1, \ldots, y_k) = \left(\prod_{i=1}^{k} P(y_i|\theta)\right) P(\theta) / P(y_1, \ldots, y_k)$.
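
The Bayesian decision rule with one or several conditionally independent
observations can be sketched as follows. Every number in the example (prior,
likelihoods, losses) is made up for illustration, as are the labels and function
names; only the formulas follow the text.

```python
from typing import Dict, Sequence


def posterior(prior: Dict[str, float],
              likelihood: Dict[str, Dict[str, float]],
              observations: Sequence[str]) -> Dict[str, float]:
    """P(theta | y_1..y_k), assuming the observations are conditionally independent."""
    unnormalized = {}
    for theta, p in prior.items():
        for y in observations:
            p *= likelihood[theta][y]
        unnormalized[theta] = p
    total = sum(unnormalized.values())  # P(y_1..y_k), which only rescales
    return {theta: p / total for theta, p in unnormalized.items()}


def bayes_decision(loss: Dict[str, Dict[str, float]], post: Dict[str, float]) -> str:
    """Choose u minimizing the conditional Bayes risk sum_theta L(u, theta) P(theta | y)."""
    return min(loss, key=lambda u: sum(loss[u][t] * post[t] for t in post))


# Hypothetical two-state problem with observations "lo" and "hi".
prior = {"t1": 0.5, "t2": 0.5}
likelihood = {"t1": {"lo": 0.8, "hi": 0.2}, "t2": {"lo": 0.3, "hi": 0.7}}
loss = {"u1": {"t1": 0.0, "t2": 1.0}, "u2": {"t1": 1.0, "t2": 0.0}}

post_one = posterior(prior, likelihood, ["hi"])
post_two = posterior(prior, likelihood, ["hi", "lo", "lo"])
print(bayes_decision(loss, post_one), bayes_decision(loss, post_two))
```

A single "hi" observation favors t2, so u2 is chosen; two further "lo"
observations shift the posterior toward t1 and the decision changes to u1.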

1 For the purposes of decision-making, P(y) is simply a scaling factor and may be omitted. 
Figure 9.2: An overview of decision theory.



9.3 Applications of Optimal Decision Making 

An overview of the field of decision theory and its subfields is pictured in Figure 



9.3.1 Classification 

Let Q = {u>i, . . . , u) n } denote a set of classes, and let y denote a feature and Y a 
feature space. For this type of problem, we have 6 = U — Q, since nature selects 
an object from one of the classes, and we attempt to identify the class nature has 
selected. The feature set Y represents useful information that can help us identify 
which class an object belongs to. 

The basic task of classification can be described as follows. We are given y, a 
feature vector, where y G Y, and Y is the set of all possible feature vectors. The 
set of possible classes is Q. Given an object with a feature vector y, we wish to 
determine the correct class uo G f2 of the object. 

Ideally, we are given P{y\uo) and P{oo), the prior distribution over the classes. 
The probability P(ou) gives the probability that an object falls in the class uo. 

A reasonable cost function is 



If the Bayesian decision strategy is adopted, it will result in choices that minimize 
the expected probability of misclassification. 



9.2. 




if u = 9 (the classification is correct) 

1 if u 7^ 9 (the classification is incorrect) 
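The link between the 0/1 cost function and the probability of misclassification can be made explicit with a short derivation. This is a sketch that uses only the conditional Bayes risk defined earlier and the cost function above.

```latex
\begin{align*}
R(u \mid y) &= \sum_{\theta \in \Theta} L(u,\theta)\, P(\theta \mid y)
             = \sum_{\theta \neq u} P(\theta \mid y)
             = 1 - P(u \mid y), \\
\gamma(y)   &= \arg\min_{u} R(u \mid y)
             = \arg\max_{u} P(u \mid y)
             = \arg\max_{u} P(y \mid u)\, P(u).
\end{align*}
```

In words, under this cost function the Bayesian strategy picks the class with the largest posterior probability, and $P(y)$ can be dropped because it does not depend on u.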



Example 9.3.1 (Optical Character Recognition) Let $\Omega = \{A, B, C, D, E, F, G, H\}$. 
Further, imagine that our image processing algorithms can extract the following 
features: 

Assuming that the image processing algorithms never err, we can use a minimax 
strategy to make our decision. Are there any letters which the features do not 
distinguish? If so, what enhancements might we make to our image processing 
algorithms to distinguish them? If we assume that the image processing 
algorithms sometimes make mistakes, then we can use a Bayesian strategy. After 
running the algorithms thousands of times and gathering statistics, we can learn 
the necessary conditional probabilities and use them to make the decision with 
the highest expectation of success. 

9.3.2 Parameter Estimation 

One subfield of Decision Theory is parameter estimation. The goal is to estimate 
the value of some parameter, given some observation of the parameter. We con- 
sider the parameter to be some fixed constant, and denote the set of all possible 
parameters (the parameter space) as X. 

Using our notation from decision theory, we have $\Theta = X$. The parameter we 
are trying to estimate is nature's choice. 

Since the goal is to guess the correct value of the parameter, the set of actions 
U is also equal to X; that is, the human player's action is to choose some valid 
value of the parameter as her guess. Therefore, we have $X = \Theta = U$. 

Further, we have an observation about the parameter, y; we denote the set of 
all possible observations by Y. Clearly, $X \subseteq Y$. 

Suppose we have $X = [0, 1] \subset \mathbb{R}$, $p(y|x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y-x)^2}{2\sigma^2}\right)$, and $p(x) = 1$. 

We interpret these probability density functions as follows: $p(y|x)$ tells us that 
there is some Gaussian noise in the observation (that is, our observations of the 
parameters, over many trials, will be concentrated around the true parameter in a 
Gaussian distribution); further, $p(x)$ tells us that each parameter is equally likely. 

Finally, we choose a loss function which measures the estimation error (that 
is, the difference between our estimate and the true parameter). We use $L(u, x) = 
(u - x)^2$. 

We wish to choose the input u which minimizes our risk; we therefore choose 
u which minimizes 

$R(u) = \int_X L(u, x)\, p(y|x)\, p(x)\, dx$ (9.1) 

Note that when Equation 9.1 is multiplied by $\frac{1}{p(y)}$, by Bayes' rule we have 

$\frac{1}{p(y)} R(u) = \int_X L(u, x)\, p(x|y)\, dx$ 

Then the expression becomes exactly analogous to the discrete form of 
the risk function from our previous lecture. Since p(y) is constant over X, we may 
remove it from the integral without affecting the correct choice of u. Therefore 
R(u) in Equation 9.1 is not exactly the risk function, but it is closely related. 
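A small numerical sketch may help make the estimation problem concrete. It discretizes X = [0, 1], uses a uniform prior and Gaussian observation noise, and searches the grid for the u that minimizes the squared-error risk. The observation y = 0.3, the noise level sigma = 0.1 and the grid size are assumptions made only for this illustration.

```python
import math

def posterior_on_grid(y, sigma, n=1001):
    """Discretize X = [0, 1]; return the grid and normalized posterior weights."""
    xs = [i / (n - 1) for i in range(n)]
    # Uniform prior p(x) = 1 and Gaussian likelihood p(y|x).
    # Constant factors (including 1/p(y)) cancel in the normalization.
    w = [math.exp(-(y - x) ** 2 / (2 * sigma ** 2)) for x in xs]
    total = sum(w)
    return xs, [wi / total for wi in w]

def risk(u, xs, post):
    """Approximate the posterior expected squared error of the estimate u."""
    return sum((u - x) ** 2 * p for x, p in zip(xs, post))

xs, post = posterior_on_grid(y=0.3, sigma=0.1)
posterior_mean = sum(x * p for x, p in zip(xs, post))
best_u = min(xs, key=lambda u: risk(u, xs, post))
```

For the squared-error loss the minimizer coincides (up to grid resolution) with the posterior mean, which is why `best_u` and `posterior_mean` agree closely here.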



9.4 Utility Theory 

Utility theory asks: Where does the loss function L come from? In other words, 
how useful is one loss compared against another? 

In utility theory we replace "loss" with "reward"; the human wants to maximize 
reward. Note that this convention can easily be inverted to return to the 
usual "loss" convention in decision theory. 

9.4.1 Choosing a Good Reward 

Let $u_1$ be some fairly unpleasant task (such as writing scribe notes), and $u_2$ be 
the act of doing nothing. We consider the problem of choosing a good reward 
function using the following examples. 

1. Let $R(u_1) = 1000$, and $R(u_2) = 0$. For a poor graduate student, it may 
be worthwhile to write scribe notes for \$1000, so the student will probably 
do $u_1$ in this scenario. One difficulty in this scenario is that we haven't 
considered the possible cost to the human of each action. 

2. Let $R(u_1) = 10001000$, and $R(u_2) = 10000000$. Although the difference 
between the two rewards is the same as before, the action chosen is probably 
different! This is so because the value (or utility) of money decreases as we 
have more of it. 

3. Let $R(u_1) = 10000$ and $R(u_2) = 25000$ with probability 1/2, and 0 with 
probability 1/2. In this scenario some conservative students may choose $u_1$, 
to guarantee a reward, while more adventurous gamblers may choose $u_2$, as 
the expected gain is greater. 

4. Let $R(u_1) = 100$, and $R(u_2) = 250$ with probability 1/2, and 0 otherwise; 
allow the student to choose an action (and collect the corresponding reward) 
100 times. The expected reward for each action remains the same, but we 
imagine more students will choose $u_2$ 100 times than those that will choose 
$u_1$ 100 times. This is so because "repeatability" is important in games of 
expectation; that is, the true outcome of a game is more likely to be near the 
expected value if the number of trials increases. 

Figure 9.3: 

The goal of utility theory is to construct reward functions that give the right 
expected value for a game, given the preferences of the human player. 

We call the set of all possible rewards for a given game the reward space, and 
denote it by $\mathcal{R}$. For example, in scenario 3, we have $\mathcal{R} = \{0, 10000, 25000\}$. 

Consider the game with nature depicted in Figure 9.3. Suppose $P(\theta_1)$ and 
$P(\theta_2)$ are given. Then choosing $u_1$ implies we get $R(u_1) = 1$ with probability 
$P(\theta_1)$, and $R(u_1) = 3$ with probability $P(\theta_2)$. We may consider the prior 
distribution over $\Theta$ as giving us a probability distribution over $\mathcal{R}$. 

In general, we let $\mathcal{P}$ be the set of all probability distributions over $\mathcal{R}$, and 
let $P \in \mathcal{P}$ be one such distribution. We expect the human player to express her 
preference between any two such probability distributions, and we denote these 
preferences with the usual inequality operations. Thus $P_1 \leq P_2$ indicates that the 
human prefers $P_1$ no more than $P_2$, $P_1 = P_2$ indicates that the human has no 
preference among $P_1, P_2$, and so on. 

We may then express the goal of utility theory as follows: we wish to find some 
function $V : \mathcal{R} \to \mathbb{R}$ such that $P_1 \leq P_2$ iff $E_{P_1}[V(r)] \leq E_{P_2}[V(r)]$. That is, the 
expected value of V is greater under more preferred distributions over the 
reward space. 

Note that computing V is difficult. However, we know (but will not prove) 
that V exists when the human is rational, that is, when her choices obey the 
Axioms of Rationality. 
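To see how a function V can reorder preferences, the sketch below compares the two actions of scenario 3 first by expected money and then by expected utility. Two things here are assumptions made for illustration only: the outcomes of $u_2$ are taken to have probability 1/2 each, and the concave utility V(r) = log(1 + r) is a hypothetical choice standing in for a student for whom extra money matters less as the amount grows.

```python
import math

def expected(dist, f=lambda r: r):
    """Expected value of f(r) when r is drawn from dist (dict: reward -> probability)."""
    return sum(p * f(r) for r, p in dist.items())

# Distributions over the reward space {0, 10000, 25000}.
P1 = {10000: 1.0}            # action u1: a guaranteed reward
P2 = {25000: 0.5, 0: 0.5}    # action u2: assumed 50/50 gamble

def V(r):
    # Hypothetical concave utility: each extra dollar is worth a little less.
    return math.log(1 + r)

prefers_u2_by_money = expected(P1) < expected(P2)
prefers_u2_by_utility = expected(P1, V) < expected(P2, V)
```

With raw money as the reward, $u_2$ has the larger expectation; with the concave V, the guaranteed reward of $u_1$ wins, which matches the behavior of the "conservative students" described above.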

9.4.2 Axioms of Rationality 

We say that a human is rational when the preferences she expresses among 
probability distributions over $\mathcal{R}$ obey the following axioms. 

1. If $P_1, P_2 \in \mathcal{P}$, then either $P_1 \leq P_2$ or $P_2 \leq P_1$. 

2. If $P_1 \leq P_2$ and $P_2 \leq P_3$ then $P_1 \leq P_3$. 

3. If $P_1 < P_2$ then $\alpha P_1 + (1 - \alpha) P_3 < \alpha P_2 + (1 - \alpha) P_3$, for all $P_3 \in \mathcal{P}$ and 
$\alpha \in (0, 1)$. 

This axiom is strange, but it merely means that no matter how much we 
"blend" $P_1$ and $P_2$ with some other distribution $P_3$, we will still prefer the 
"blended" $P_2$ to the "blended" $P_1$. 

4. If $P_1 < P_2 < P_3$ then $\exists \alpha \in (0, 1), \beta \in (0, 1)$ such that $\alpha P_1 + (1 - \alpha) P_3 < P_2$ 
and $P_2 < \beta P_1 + (1 - \beta) P_3$. 

This axiom means that no matter how good $P_3$ is, we can always blend a 
bit of it with $P_1$ to get a distribution less preferable than $P_2$; similarly, no 
matter how bad $P_1$ is, we can always blend a bit of it with $P_3$ to get a 
distribution more preferable than $P_2$. 

Figure 9.4: A scenario in which worst-case decision making might yield undesirable 
results. 

9.5 Criticisms of Decision Theory 

We consider a few criticisms of decision theory: 

1. The values of rewards are subjective. If they are provided by the human, 
then the process of making a decision may amount to "garbage in, garbage 
out." 

2. It is difficult to assign losses. 

3. Badly chosen loss functions can lead to bad decisions. 

One response to this criticism is sensitivity analysis, which claims that the 
decisions are not hypersensitive to the loss functions. Of course, if this 
argument is taken too far, then the value of decision theory itself is thrown 
into question. 

9.5.1 Nondeterministic decision making 

There are two main criticisms of nondeterministic decision making: first, "worst- 
case" analysis can yield undesirable results in practice; and second, the same 
decisions can often be acquired through Bayesian decision making by manipulating 
the prior distribution. 

Consider the rewards in Figure 9.4. Worst-case analysis causes us to choose 
$u_1$, although in practice we may want to choose $u_2$ if we know the risk of event 
$\theta_2$ is low (equivalently, the probability of $\theta_2$ must be rather high for the expected 
value of the scenario to favor choosing $u_1$). Further, we can simulate the result of 
the nondeterministic decision as a Bayesian decision by suitably assigning prior 
distributions, for example, by setting $P(\theta_1) \ll P(\theta_2)$. 

Figure 9.5: A comparison of the Bayesian and frequentist interpretations of 
probabilities. 

9.5.2 Bayesian decision making 

A common criticism of Bayesian decision making centers around the Bayesian 
interpretation of probabilities. We compare the Bayesian and frequentist inter- 
pretations of probabilities in Figure 9.5. 

While frequentists do not incorporate prior beliefs into decisions, they do 
incorporate observations. Thus, a frequentist risk function might be: 

$R(\theta, \gamma) = \sum_{y \in Y} L(\gamma(y), \theta)\, P(y|\theta)$ 

One problem with this function is that both $\theta$ and $\gamma$ are 
unknown. If we are to choose a $\gamma$ which minimizes $R(\theta, \gamma)$, then our choice of $\theta$ 
might considerably influence the choice of $\gamma$. 

Prior distributions A considerable difficulty for Bayesian decision making is 
determining the prior distributions. One common distribution is the Laplace 
distribution. Using the principle of insufficient reason, this distribution makes 
each $\theta$ equally likely. 

The Laplace distribution has some justification from information theory. Lacking 
any information about $\Theta$, we may wish to choose the most "non-informative" 
prior, that is, the probability distribution which contains the least information. 

The entropy contained in a probability distribution over $\Theta$ can be computed 
using the Shannon Entropy Equations: 

$E = -\sum_{\theta \in \Theta} P(\theta) \log P(\theta)$ (9.2) 

$E = -\int_{\Theta} p(\theta) \log p(\theta)\, d\theta$ (9.3) 

Equation 9.2 is for discrete probability mass functions; Equation 9.3 is for 
continuous probability density functions. 

Using Shannon's Entropy Equations, we can show that the probability 
distribution which yields the least information (the highest entropy) is that which 
assigns equal probabilities to all events in $\Theta$, that is, the Laplace distribution. 
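The claim about the uniform distribution can be checked numerically for a small discrete $\Theta$. The sketch below evaluates the discrete entropy for three hypothetical distributions over four events; the base-2 logarithm (entropy in bits) is an assumption of this example, and any other base only rescales the values.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution; 0 log 0 is taken as 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]   # the Laplace distribution on four events
skewed = [0.7, 0.1, 0.1, 0.1]        # a more "informative" prior
certain = [1.0, 0.0, 0.0, 0.0]       # no uncertainty at all
```

Here `entropy(uniform)` is 2 bits, the largest value possible for four events, while `entropy(certain)` is zero.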

The structure of $\Theta$ We encounter several problems with the Laplace 
distribution as we consider the structure of $\Theta$. 

Suppose $\Theta = \mathbb{R}$. The Laplace distribution assigns probability 0 to any bounded 
interval of $\mathbb{R}$. This difficulty is mostly mechanical, however; the use of generalized 
probability density functions solves this problem. 

Often, we can structure $\Theta$ in arbitrary ways that significantly affect the prior 
distribution. Suppose we let $\Theta = \{\theta_1, \theta_2\}$, where $\theta_1$ indicates "no precipitation" 
and $\theta_2$ indicates "some precipitation". The Laplace distribution assigns $P(\theta_1) = 
P(\theta_2) = 1/2$. 

Suppose instead we let $\Theta' = \{\theta_1, \theta_2, \theta_3\}$, where $\theta_1$ indicates "no precipitation", 
$\theta_2$ indicates "rain", and $\theta_3$ indicates "snow". Clearly, $\Theta'$ describes the same set 
of events as $\Theta$. But in this scenario the Laplace distribution assigns $P(\theta_1) = 
P(\theta_2) = P(\theta_3) = 1/3$. The combined probability of precipitation is 2/3. Which 
characterization of nature is correct, $\Theta$ or $\Theta'$? 

The following is an interesting practical example of arbitrary choices about 
the structure of $\Theta$. Suppose we wish to fit a line to a set of points. The equation 
for the line is $\theta_1 x + \theta_2 y + \theta_3 = 0$. What prior distribution should we choose for $\theta_1$, 
$\theta_2$ and $\theta_3$? We could choose to "spread the probability" around the unit sphere, 
by requiring $\theta_1^2 + \theta_2^2 + \theta_3^2 = 1$. However, this choice is entirely arbitrary; we could 
have also spread the probability around the unit cube, with very different results. 

9.6 Multiobjective Optimality 

For now, we concentrate on multiple-objective decisions with no uncertainty. 
Thus, we have U (the input space), and the loss function $L : U \to \mathbb{R}^d$. 

The goal is to find all $u \in U$ such that there is no $u' \in U$ with $L(u') < L(u)$. 
That is, we wish to compute the set of "minimal" inputs u, using the partial 
ordering $\leq$. This set is called the Pareto optimal set of inputs. 

Let $L(u) = (L_1(u), \ldots, L_d(u))$. Then we define $L(u') \leq L(u)$ iff $L_i(u') \leq L_i(u)$ 
for all i, and $L(u') < L(u)$ iff $L(u') \leq L(u)$ and $L(u') \neq L(u)$. 

Consider the multi-decision problem depicted in Figure 9.6. Two robots, 
indicated by hollow circles, wish to travel along the paths designated. Suppose the 
path and speed for each robot are fixed; the only actions possible are starting 
and stopping at various points in time. Suppose further that the loss function 
computes the time for each robot to travel along its designated path. 

Clearly, many inputs are possible. For example, one robot could wait 
until the other robot has reached its goal before starting to move. Alternatively, 
both robots could move until just before collision, at which point one robot stops 
and the other continues moving; the stopped robot continues moving once collision 
is avoided. Nevertheless, there will only be two Pareto optimal loss values for this 
problem, such as (4,6) and (6,4). 

Figure 9.6: A multi-objective decision problem in which, although there are many 
conceivable inputs, there are only two Pareto optimal loss values. 

One problem with Pareto optimality is that it might yield an "optimal" set 
which is identical to U. For example, consider $U = [0, 1]$ and $L(u) = (u, 1 - u)$. It 
is easy to see that the optimal set for this scenario is just U, since whenever one 
component of L(u) increases, the other decreases by the same amount. 
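For a finite set of inputs, the Pareto optimal set can be computed by directly checking the definition. The sketch below is one straightforward way to do so; the input names and loss vectors in `table` are hypothetical (only the two values (4, 6) and (6, 4) echo the robot example), and the second call samples the interval [0, 1] on a coarse grid to mimic the L(u) = (u, 1 - u) case.

```python
def dominates(a, b):
    """True if loss vector a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_optimal(inputs, L):
    """Return the inputs whose loss vectors are not dominated by any other input."""
    losses = {u: L(u) for u in inputs}
    return [u for u in inputs
            if not any(dominates(losses[v], losses[u]) for v in inputs)]

# Hypothetical finite input set with two-component losses.
table = {"a": (4, 6), "b": (6, 4), "c": (6, 6), "d": (5, 7)}
front = pareto_optimal(list(table), table.get)

# Sampled version of U = [0, 1] with L(u) = (u, 1 - u).
grid = [i / 10 for i in range(11)]
front_line = pareto_optimal(grid, lambda u: (u, 1 - u))
```

In the first case only "a" and "b" survive; in the second case every sampled input is Pareto optimal, which illustrates how the "optimal" set can be as large as U itself.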

9.6.1 Scalarizing L 

If we can "scalarize" L, then we can find a single optimal value of L(u), rather 
than many possible optimal values. 

We can scalarize L as follows. Choose $\alpha_1, \ldots, \alpha_d \in (0, 1)$. Let $l(u) = \sum_{i=1}^{d} \alpha_i L_i(u)$. 
Note that l(u) is just the dot product of L(u) and the $\alpha$ vector. 

We can make a multi-objective decision by choosing the u which minimizes 
l(u). It turns out that this u must also be in the Pareto optimal set. Note that it 
is possible that more than one u yields the minimum l(u). 

We might interpret the $\alpha_i$ as a set of "priorities" over the components of L: 
components with higher $\alpha_i$ are more important, and higher losses in the 
corresponding components of L(u) should be avoided. 
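The effect of the weights can be seen on a small example. The sketch below scalarizes a two-component loss with two different weight vectors; the inputs, loss vectors and weights are hypothetical values chosen for illustration.

```python
def scalarize(L, alpha):
    """Return l(u) = sum_i alpha_i * L_i(u) for a vector-valued loss L."""
    return lambda u: sum(a * Li for a, Li in zip(alpha, L(u)))

# Hypothetical inputs with two-component losses.
table = {"a": (4, 6), "b": (6, 4), "c": (6, 6), "d": (5, 7)}

# Prioritize the first component of the loss.
l_first = scalarize(table.get, alpha=(0.7, 0.3))
best_first = min(table, key=l_first)

# Prioritize the second component of the loss.
l_second = scalarize(table.get, alpha=(0.3, 0.7))
best_second = min(table, key=l_second)
```

Each choice of weights selects a single input, and in both cases the selected input ("a" or "b") is one of the undominated ones, consistent with the remark that the minimizer of l(u) lies in the Pareto optimal set.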



9.7 Two-Player Zero Sum Games 
9.7.1 Overview of game theory 

In a game, several decision makers strive to maximize their (expected) utility by 
choosing particular courses of action, and each decision maker's final utility 
payoffs depend on the courses of action chosen by all decision makers. The interactive 
situation specified by the set of participants, the possible courses of action of each 
decision maker, and the set of all possible utility payoffs, is called a game; the 
decision makers 'playing' a game are called the players. 

Game theory is a set of analytical tools designed to help us understand the phe- 
nomena that we observe when decision-makers interact. The basic assumptions 
that underlie the theory are that decision-makers pursue well-defined exogenous 
objectives (they are rational) and take into account their knowledge or expecta- 
tions of other decision-makers' behavior (they reason strategically). 
Some of the areas of game theory that we are going to look into are: 

• Multiple Decision Makers: There will be two or more decision makers, 
trying to make decisions at the same time. 

• Single stage v Multiple stage 

• Zero sum v Non zero sum games: Zero-sum games are games where the 
amount of "winnable goods" (or resources) is fixed. Whatever is gained by 
one decision maker is therefore lost by the other decision maker: the sum 
of gains (positive) and losses (negative) is zero. 

In non-zero-sum games there is no universally accepted solution. That is, 
there is no single optimal strategy that is preferable to all others, nor is there 
a predictable outcome. Non-zero-sum games are also non-strictly competi- 
tive, as opposed to the completely competitive zero-sum games, because such 
games generally have both competitive and cooperative elements. Players 
engaged in a non-zero sum conflict have some complementary interests and 
some interests that are completely opposed. 

• Different Information States for each player: Each player has an infor- 
mation set corresponding to the decision nodes, which are used to represent 
situations where the player may not have complete knowledge about every- 
thing that happens in a game. Information sets are unique for each player. 

• Deterministic v Randomized Strategies: When the player uses a deter- 
ministic or pure strategy, the player specifies a choice from his information 
set. When a player uses a mixed strategy, he plays unpredictably in order 
to keep the opponent guessing. 

• Cooperative v Noncooperative: A player may be interpreted as an 
individual or as a group of individuals making a decision. Once we define 
the set of players, we may distinguish between two types of models: those 
in which the sets of possible actions of individual players are primitives and 
those in which the sets of possible joint actions of groups of players are 
primitives. Models of the first type can be referred to as "noncooperative", 
while those of the second type can be referred to as "cooperative". 

There are two main assumptions in game theory: 

• Players know each other's loss functionals. 

• Players are rational decision makers. 

The following table summarizes some of the above mentioned features: 

# of players   # of steps   Nature?   Cost functionals   Example 
1              1            N         1                  Classical Optimization 
1              1            Y         1                  Basic Decision Theory 
> 1            1            N         > 1                Matrix Games 
> 1            1            Y         > 1                Markov Games (probabilistic) 
1              > 1          N         1                  Optimal Control Theory 
1              > 1          Y         1                  Stochastic Control 
> 1            > 1          N/Y       > 1                Dynamic Game Theory 
1              1            N         > 1                Multi-objective Optimality 
> 1            > 1          N/Y       1                  Team Theory 


The most elementary type of two-player zero sum games are matrix games. 
The main features of such games are: 

• There are two players $P_1$ and $P_2$ and an (m × n) dimensional loss matrix 
$A = \{a_{ij}\}$. 

• Each entry of the matrix is an outcome of the game, corresponding to a 
particular pair of decisions made by the players. 

• For $P_1$, the alternatives are the m rows of the matrix and for $P_2$, the 
alternatives are the n columns of the matrix. These are also known as the 
strategies of the players and can be expressed in the following way: 

$U^1 = \{u^1_1, u^1_2, \ldots, u^1_m\}$ 

$U^2 = \{u^2_1, u^2_2, \ldots, u^2_n\}$ 

• Both players play simultaneously. 

• If $P_1$ chooses the ith row and $P_2$ chooses the jth column, then $a_{ij}$ is the 
outcome of the game and $P_1$ pays this amount to $P_2$. In case $a_{ij}$ is negative, 
this should be interpreted as $P_2$ paying $P_1$ the positive amount 
corresponding to this entry. 

More formally, for each pair $\langle u^1_i, u^2_j \rangle$, 

$P_1$ has loss $L_1(u^1_i, u^2_j)$ and 

$P_2$ has loss $L_2(u^1_i, u^2_j) = -L_1(u^1_i, u^2_j)$. 

We can write the loss functional as simply L, where $P_1$ tries to minimize L 
and $P_2$ tries to maximize L. 

Example: In order to illustrate the above mentioned features of matrix games, 
suppose the loss matrix for players $P_1$ and $P_2$ is the following (3 × 4) matrix. 

                   P_2 

            1    3    3   -2 
P_1         0   -1    2    1 
           -2    2    0    1 


In this case, $P_2$, who is the maximizer, has a unique security strategy, "column 
3" ($j^* = 3$), securing him a gain-floor of $\underline{V} = 0$. $P_1$, who is the minimizer, has two 
security strategies, "row 2" and "row 3" ($i^*_1 = 2$, $i^*_2 = 3$), yielding him a loss-ceiling 
of $\overline{V} = \max_j a_{2j} = \max_j a_{3j} = 2$, which is above the security level of $P_2$. 
We can express this more formally in the following notation: 
Security strategy for $P_1$ = $\arg\min_i \max_j L(u^1_i, u^2_j)$ 
Therefore, loss-ceiling or upper value $\overline{V} = \min_i \max_j L(u^1_i, u^2_j)$ 

Security strategy for $P_2$ = $\arg\max_j \min_i L(u^1_i, u^2_j)$ 
Therefore, gain-floor or lower value $\underline{V} = \max_j \min_i L(u^1_i, u^2_j)$ 
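The security strategies and the upper and lower values can be computed mechanically from the loss matrix. The sketch below does this for a 3 × 4 matrix whose entries are chosen to agree with the security strategies and values stated above; the matrix entries and the function names should be read as assumptions of this illustration. Indices in the code are 0-based, so "row 2" in the text is index 1.

```python
# Loss matrix: rows are P1's choices (minimizer), columns are P2's (maximizer).
A = [[1, 3, 3, -2],
     [0, -1, 2, 1],
     [-2, 2, 0, 1]]

def security_rows(A):
    """P1 picks the rows whose worst-case (maximum) loss is smallest."""
    worst = [max(row) for row in A]
    upper = min(worst)  # loss-ceiling, or upper value
    return upper, [i for i, w in enumerate(worst) if w == upper]

def security_cols(A):
    """P2 picks the columns whose worst-case (minimum) gain is largest."""
    worst = [min(col) for col in zip(*A)]
    lower = max(worst)  # gain-floor, or lower value
    return lower, [j for j, w in enumerate(worst) if w == lower]

upper, rows = security_rows(A)
lower, cols = security_cols(A)
```

For this matrix the upper value is 2 (attained by rows 2 and 3 in the text's 1-based numbering) and the lower value is 0 (attained by column 3).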



9.7.2 Regret 

If $P_2$ plays first, then he chooses column 3 as his security strategy and $P_1$'s unique 
response would be row 3, yielding an outcome of $\underline{V} = 0$. However, if $P_1$ plays 
first, then he is indifferent between his two security strategies. In case he chooses 
row 2, $P_2$ will respond by choosing column 3, and if $P_1$ chooses row 3, then $P_2$ 
chooses column 2, both strategies resulting in an outcome of $\overline{V} = 2$. 
This means that when there is a definite order of play, security strategies of the 
player who acts first make complete sense and they can be considered to be in 
equilibrium with the corresponding response strategies of the other player. By the 
two strategies being in equilibrium, it is meant that after the game is over and 
its outcome is observed, the players should have no ground to regret their past 
actions. Therefore, in a matrix game with a fixed order of play, for example, there 
is no justifiable reason for a player who acts first to regret his security strategy. 

In matrix games where players arrive at their decisions independently, the security 
strategies need not possess any sort of equilibrium property. To illustrate this, we 
look at the following matrix: 

                   P_2 

            4    0   -1 
P_1         0   -1    3 
            1    2    1 

We assume that the players act independently and the game is to be played only 
once. Both players have unique security strategies, "row 3" for $P_1$ and "column 
1" for $P_2$, with the upper and lower values of the game being $\overline{V} = 2$ and $\underline{V} = 0$ 
respectively. If both players play according to their security strategies, then the 
outcome of the game is 1, which is midway between the security levels of the 
players. But after the game is over, both $P_1$ and $P_2$ might have regrets: knowing 
that $P_2$ played column 1, $P_1$ would have preferred row 2 (a loss of 0), and knowing 
that $P_1$ played row 3, $P_2$ would have preferred column 2 (a gain of 2). This 
indicates that in this matrix game, the security strategies of the players cannot 
possibly possess any equilibrium property. 

9.7.3 Saddle Points 

For a class of matrix games with equal upper and lower values, a dilemma 
regarding regret does not arise. If there exists a matrix game where $\overline{V} = \underline{V} = V$, then 
we say that the strategies are in equilibrium, since each one is optimal against 
the other. The strategy pair (row x, col y), possessing all the favorable features, is 
clearly the only candidate that can be considered as the equilibrium of the matrix 
game. 

Such equilibrium strategies are known as saddle point strategies and the 
matrix game in question is said to have a saddle point in pure strategies. 
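A pure saddle point can be found by scanning the matrix for an entry that is simultaneously the largest in its row (so the maximizer, who picks columns, cannot gain by deviating) and the smallest in its column (so the minimizer, who picks rows, cannot lose less by deviating). The sketch below implements that test; the three small matrices are hypothetical examples, and indices are 0-based.

```python
def pure_saddle_points(A):
    """Return all (i, j) where A[i][j] is the max of row i and the min of column j."""
    cols = list(zip(*A))
    return [(i, j)
            for i, row in enumerate(A)
            for j, a in enumerate(row)
            if a == max(row) and a == min(cols[j])]

one_saddle = [[1, 2],
              [3, 4]]
two_saddles = [[2, 2],
               [3, 3]]
no_saddle = [[4, 0, -1],
             [0, -1, 3],
             [1, 2, 1]]
```

When the list is empty, as for `no_saddle`, the upper and lower values differ and the regret dilemma described earlier can arise.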

There can also be multiple saddle points as shown in the following figure: 

Figure 9.7: Saddle point 

9.7.4 Mixed Strategies 

Another approach to obtain equilibrium in a matrix game that does not possess 
a saddle point and in which players act independently is to enlarge the strategy 
spaces so that the players can base their decisions on the outcome of random 
events; such a strategy is called a mixed strategy or randomized strategy. Unlike the 
pure strategy case, here the same game is allowed to be played over and over 
again, and the final outcome, sought to be minimized by $P_1$ or maximized by $P_2$, 
is determined by averaging the outcomes of the individual plays. 

A strategy of a player can be represented by a probability vector. Suppose the 
strategy for $P_1$ is represented by 

$y = [y_1, y_2, \ldots, y_m]^T$ where $y_i \geq 0$ and $\sum_{i=1}^{m} y_i = 1$ 

and the strategy for $P_2$ is represented by 

$z = [z_1, z_2, \ldots, z_n]^T$ where $z_j \geq 0$ and $\sum_{j=1}^{n} z_j = 1$ 

Let A be the loss matrix. Then the expected loss for $P_1$ is 

$E[L_1] = \sum_{i=1}^{m} \sum_{j=1}^{n} y_i a_{ij} z_j = y^T A z$ 

Note: Az is the vector of expected losses over $P_2$'s choices, with one entry for each 
of $P_1$'s actions. In this way, Az makes $P_2$ look like nature (probabilistic) to $P_1$. 

The expected loss for $P_2$ is 

$E[L_2] = -E[L_1]$ 

It turns out that we can always find a saddle point in the space of mixed strategies. 
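The expected loss $y^T A z$ is simple to evaluate directly from the double sum. The following sketch uses exact fractions; the 2 × 2 matrix and the two probability vectors are hypothetical values picked only to show the computation.

```python
from fractions import Fraction as F

def expected_loss(y, A, z):
    """Return y^T A z: P1's average loss when rows are drawn from y and columns from z."""
    return sum(y[i] * A[i][j] * z[j]
               for i in range(len(A)) for j in range(len(A[0])))

A = [[3, 0],
     [-1, 1]]                # hypothetical loss matrix
y = [F(1, 2), F(1, 2)]       # P1 mixes the two rows evenly
z = [F(1, 4), F(3, 4)]       # P2 favors the second column

value = expected_loss(y, A, z)
pure_value = expected_loss([1, 0], A, [0, 1])  # pure strategies pick out one entry
```

A pure strategy is the special case of a probability vector with a single 1, in which case the expected loss reduces to the corresponding matrix entry.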

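Mixed Security Strategy A vector $y^* \in Y$ is called a mixed security strategy 
for $P_1$ in the matrix game A, if the following inequality holds for all $y \in Y$: 

$\overline{V}_m(A) = \max_{z \in Z} y^{*T} A z \leq \max_{z \in Z} y^T A z, \quad y \in Y$ (9.4) 

Here, the quantity $\overline{V}_m$ is known as the average security level of $P_1$. 

Analogously, a vector $z^* \in Z$ is called a mixed security strategy for $P_2$ in the matrix 
game A, if the following inequality holds for all $z \in Z$: 

$\underline{V}_m(A) = \min_{y \in Y} y^T A z^* \geq \min_{y \in Y} y^T A z, \quad z \in Z$ (9.5) 

Here, the quantity $\underline{V}_m$ is known as the average security level of $P_2$. 
From Equation 9.4, we have 

$\overline{V}_m \leq \overline{V}$ (9.6) 

Similarly, from Equation 9.5: 

$\underline{V} \leq \underline{V}_m$ (9.7) 

Therefore, combining Equations 9.6 and 9.7 with the fact that $P_2$'s security level 
cannot exceed $P_1$'s, we have: 

$\underline{V} \leq \underline{V}_m \leq \overline{V}_m \leq \overline{V}$ (9.8) 

According to von Neumann, $\underline{V}_m$ and $\overline{V}_m$ are always equal. So Equation 9.8 can be 
written as: 

$\underline{V} \leq \underline{V}_m = \overline{V}_m \leq \overline{V}$ (9.9) 

which essentially means that there always exists a saddle point for mixed 
strategies. 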

9.7.5 Computation of Equilibria 

It has been shown that a two person zero-sum matrix game always admits a saddle 
point equilibrium in mixed strategies. An important property of mixed saddle 
point strategies is that each player has a mixed security strategy, and each mixed 
security strategy is also a mixed saddle point strategy for that player. 
Using this property, one possible way of obtaining the saddle point solution 
of a matrix game is to determine the mixed security strategies 
for each player. 

Let us consider the following (2 × 2) matrix game: 

                 P_2 

P_1         3    0 
           -1    1 

Let the mixed strategies y and z be defined as follows: 

$y = [y_1, y_2]^T$ 
$z = [z_1, z_2]^T$ 

For $P_1$, our goal is to find the $y^*$ that optimizes $y^T A z$ while $P_2$ is trying to do 
his best, i.e. $P_2$ uses only pure strategies. Therefore, $P_2$ can be expected to play 
either ($z_1 = 1, z_2 = 0$) or ($z_1 = 0, z_2 = 1$), and under different choices of mixed 
strategies for $P_1$, we can determine the average outcome of the game as shown in 
Figure 9.8 by the bold line, which forms the upper envelope to the straight lines drawn. 
Now, if $P_1$ adopts the mixed strategy ($y^*_1 = 2/5, y^*_2 = 3/5$) corresponding to the 
lowest point of that envelope, then the average outcome will be no greater than 
3/5. This implies that the strategy ($y^*_1 = 2/5, y^*_2 = 3/5$) is a mixed security strategy for 
$P_1$ (and his only one), and thereby, it is his mixed saddle point strategy. From 
the figure, we can see that the mixed saddle point value is 3/5. 

In order to find $z^*$, we assume that $P_1$ adopts pure strategies. Therefore, for 
different mixed strategies of $P_2$, the average outcome of the game can be determined to 
be the bold line, shown in Figure 9.9, which forms the lower envelope to the straight 
lines drawn. Since $P_2$ is the maximizer, the highest point on this envelope is his 
average security level. This he can guarantee by playing the mixed strategy 
($z^*_1 = 1/5, z^*_2 = 4/5$), which is also his saddle point strategy. 
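For a 2 × 2 game without a pure saddle point, the lowest point of the upper envelope is where P1's expected losses against the two columns are equal, and similarly for P2. Solving these two linear equations gives a closed form. The sketch below assumes the game has no pure saddle point (otherwise the formula does not apply) and uses the matrix entries 3, 0, -1, 1 as an assumption consistent with the strategies and value quoted above.

```python
from fractions import Fraction as F

def solve_2x2(A):
    """Mixed saddle point of a 2x2 loss matrix that has no pure saddle point.

    Equalizes P1's expected loss over P2's two columns, and P2's expected
    gain over P1's two rows. Returns (y1, z1, value): the probability of
    row 1, the probability of column 1, and the value of the game.
    """
    (a, b), (c, d) = A
    denom = a - b - c + d  # nonzero when there is no pure saddle point
    y1 = F(d - c, denom)
    z1 = F(d - b, denom)
    value = F(a * d - b * c, denom)
    return y1, z1, value

y1, z1, value = solve_2x2([[3, 0], [-1, 1]])
```

The result reproduces the graphical solution: P1 plays row 1 with probability 2/5, P2 plays column 1 with probability 1/5, and the value is 3/5.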

Solving matrix games with larger dimensions One alternative to the graphical 
solution described above when the dimensions are large (i.e. m × n games) is 
to convert the original matrix game into a linear programming model and make 
use of the powerful algorithms for linear programming in order to obtain the 
saddle point solutions. 
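One common way to write this conversion is sketched below. The auxiliary variables v and w are introduced here for illustration and do not appear in the text; y and z are the mixed strategies of $P_1$ and $P_2$, and $a_{ij}$ are the entries of the loss matrix.

```latex
\begin{align*}
\text{($P_1$)}\quad & \min_{y,\,v}\; v
  && \text{s.t.}\quad \sum_{i=1}^{m} a_{ij}\, y_i \le v \;\; (j = 1,\ldots,n),
  \quad \sum_{i=1}^{m} y_i = 1,\quad y_i \ge 0, \\
\text{($P_2$)}\quad & \max_{z,\,w}\; w
  && \text{s.t.}\quad \sum_{j=1}^{n} a_{ij}\, z_j \ge w \;\; (i = 1,\ldots,m),
  \quad \sum_{j=1}^{n} z_j = 1,\quad z_j \ge 0.
\end{align*}
```

The first program minimizes $P_1$'s worst-case expected loss over $P_2$'s pure strategies, and the second maximizes $P_2$'s worst-case expected gain over $P_1$'s pure strategies; the two programs are duals of one another.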

This equivalence of games and LP may be surprising, since an LP problem involves 
just one decision-maker, but it should be noted that with each LP problem there 
is an associated problem called the dual LP. The optimal values of the objective 
functions of the two LPs are equal, corresponding to the value of the game. When 
solving an LP by simplex-type methods, the optimal solution of the dual problem 
also appears as part of the final tableau. 

Figure 9.8: Mixed Security strategy for $P_1$ for the matrix game 

Figure 9.9: Mixed Security strategy for $P_2$ for the matrix game (axis marks: 
$z_1 = 1$, $z^*_1 = 1/5$, $z_1 = 0$; $z_2 = 0$, $z^*_2 = 4/5$, $z_2 = 1$) 

9.8 Nonzero Sum Games 

The branch of Game Theory that better represents the dynamics of the world 
we live in is called the theory of non-zero-sum games. Non-zero-sum games differ 
from zero-sum games in that there is no universally accepted solution. That is, 
there is no single optimal strategy that is preferable to all others, nor is there 
a predictable outcome. Non-zero-sum games are also non-strictly competitive, 
as opposed to the completely competitive zero-sum games, because such games 
generally have both competitive and cooperative elements. Players engaged in a 
non-zero sum conflict have some complementary interests and some interests that 
are completely opposed. 



9.8.1 Nash Equilibria 

A bi-matrix game is comprised of two (m × n) dimensional matrices $A = \{a_{ij}\}$ 
and $B = \{b_{ij}\}$, where each pair of entries $(a_{ij}, b_{ij})$ denotes the outcome of the game 
corresponding to a particular pair of decisions made by the players. Being a 
rational decision maker, each player will strive for an outcome which provides him 
with the lowest possible loss. 

Assuming that there is no cooperation between the players and that the players 
make their decisions independently, we now try to find a noncooperative 
equilibrium solution. The notion of saddle points in zero sum games is also relevant in 
non zero sum games, where the equilibrium solution is expected to exist if there 
is no incentive for any unilateral deviation for the players. Therefore, we have the 
following definition: 

Definition 3.1 A pair of strategies {row i*, column j*} is said to constitute 
a Nash Equilibrium if the following pair of inequalities is satisfied for all i = 
1, ..., m and all j = 1, ..., n: 

$a_{i^*j^*} \leq a_{ij^*}$ 

$b_{i^*j^*} \leq b_{i^*j}$ 

We use a 2 player, single stage game to illustrate the features of a non zero sum 
game. The two players, $P_1$ and $P_2$, have individual loss functions represented 
by the matrices A and B respectively: 
For A: 



P 2 

P 
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and for B: 

P 2 



Pi 



2 


3 


1 






It admits two Nash equilibria, {row 1, col 1} and {row 2, col 2}. The corresponding 
Nash equilibria is (1,2) and (-1,0). 
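The inequalities of Definition 3.1 can be checked mechanically by enumerating
every (row, column) pair. The following is a minimal sketch, not a method given
in these notes: the function name `pure_nash_equilibria` is hypothetical, and
the matrix entries are an assumption chosen to agree with the stated
equilibrium outcomes (1, 2) and (-1, 0).

```python
from typing import List, Tuple

Matrix = List[List[float]]


def pure_nash_equilibria(A: Matrix, B: Matrix) -> List[Tuple[int, int]]:
    """Return all 1-indexed (row, column) pairs satisfying Definition 3.1.

    A holds the losses of the row player, B the losses of the column player.
    """
    m, n = len(A), len(A[0])
    equilibria = []
    for i in range(m):
        for j in range(n):
            # The row player cannot lower its loss by changing rows
            # while the column player stays in column j.
            row_ok = all(A[i][j] <= A[k][j] for k in range(m))
            # The column player cannot lower its loss by changing columns
            # while the row player stays in row i.
            col_ok = all(B[i][j] <= B[i][c] for c in range(n))
            if row_ok and col_ok:
                equilibria.append((i + 1, j + 1))
    return equilibria


# Assumed entries, consistent with the outcomes (1, 2) and (-1, 0).
A = [[1, 0], [2, -1]]
B = [[2, 3], [1, 0]]
print(pure_nash_equilibria(A, B))  # [(1, 1), (2, 2)]
```

Enumeration costs O(mn(m + n)) comparisons, which is fine for small matrices
and makes the "no unilateral deviation" condition explicit.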



9.8.2 Admissibility 

The previous example shows that a bi-matrix game can admit more than one
Nash equilibrium solution, with the equilibrium outcomes being different in each
case. This raises the question of whether there is a way of choosing one
equilibrium over the other. We introduce the concept of admissibility as follows:

Better A pair of strategies {row $i_1$, column $j_1$} is said to be better than
another pair of strategies {row $i_2$, column $j_2$} if
$a_{i_1 j_1} \le a_{i_2 j_2}$ and $b_{i_1 j_1} \le b_{i_2 j_2}$, and if at least
one of these inequalities is strict.

Admissibility A Nash equilibrium strategy pair is said to be admissible if there
exists no better Nash equilibrium strategy pair.

In the given example, {row 2, column 2} is the one that is admissible out of
the two Nash equilibrium solutions, since it provides lower costs for both players.
This pair of strategies can be described as the most reasonable noncooperative
equilibrium solution of the bi-matrix game. When a bi-matrix game admits more
than one admissible Nash equilibrium, the choice becomes more difficult. If the
two matrices are as follows:
For A:

\[ \begin{pmatrix} -2 & 1 \\ -1 & -1 \end{pmatrix} \]

and for B:

\[ \begin{pmatrix} -1 & 1 \\ 2 & -2 \end{pmatrix} \]

there are two admissible Nash equilibrium solutions, {row 1, column 1} and
{row 2, column 2}, with the equilibrium outcomes being (-2, -1) and (-1, -2),
respectively. This game can lead to regret unless some communication and
negotiation is possible.
However, if the equilibrium strategies are interchangeable, then the ambiguity
that arises from the existence of multiple admissible Nash equilibrium solutions
can be resolved. This necessarily requires the corresponding outcomes to be the
same. Since zero-sum matrix games are special types of bi-matrix games (in which
case the equilibrium solutions are known to be interchangeable), it follows that
there exists some nonempty class of bi-matrix games whose equilibrium solutions
possess such a property. More precisely:

Multiple Nash equilibria of a bi-matrix game (A, B) are interchangeable if (A, B)
is strategically equivalent to (A, -A).
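The "better" relation and the admissibility test defined earlier in this
section can be sketched in a few lines. The helper names `is_better` and
`admissible` are hypothetical, and the matrices are assumed entries that
reproduce the equilibrium outcomes quoted for the two examples.

```python
def is_better(p, q, A, B):
    """True if strategy pair p is better than pair q (both 1-indexed).

    p must be no worse for both players and strictly better for at least one.
    """
    (i1, j1), (i2, j2) = p, q
    a1, b1 = A[i1 - 1][j1 - 1], B[i1 - 1][j1 - 1]
    a2, b2 = A[i2 - 1][j2 - 1], B[i2 - 1][j2 - 1]
    return a1 <= a2 and b1 <= b2 and (a1 < a2 or b1 < b2)


def admissible(equilibria, A, B):
    """Keep only the equilibria for which no better equilibrium exists."""
    return [p for p in equilibria
            if not any(is_better(q, p, A, B) for q in equilibria)]


# First example: outcomes (1, 2) and (-1, 0); only {row 2, column 2} survives.
A1, B1 = [[1, 0], [2, -1]], [[2, 3], [1, 0]]
print(admissible([(1, 1), (2, 2)], A1, B1))  # [(2, 2)]

# Second example: outcomes (-2, -1) and (-1, -2); neither dominates the other.
A2, B2 = [[-2, 1], [-1, -1]], [[-1, 1], [2, -2]]
print(admissible([(1, 1), (2, 2)], A2, B2))  # [(1, 1), (2, 2)]
```

In the second game each player prefers a different equilibrium, which is why
independent choices can end in a mismatch that both players regret.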



9.8.3 The Prisoner's Dilemma 

The following example shows how, by using a Nash equilibrium, the prisoners can
achieve results that yield no regrets, but how, by cooperating, they could have
done much better. We show the cost of cooperation and denial of wrongdoing in
the form of the following two matrices:
For A:

\[ \begin{pmatrix} 8 & 0 \\ 30 & 2 \end{pmatrix} \]

and for B:

\[ \begin{pmatrix} 8 & 30 \\ 0 & 2 \end{pmatrix} \]

Using the Nash equilibrium, the choice is (8, 8), which yields no regret for
either A or B. However, if the prisoners had cooperated, then they would have
ended up with (2, 2), which is much better for both of them.



9.8.4 Nash Equilibrium for mixed strategies 

Nash showed that every non-cooperative game with finite sets of pure strategies
has at least one mixed strategy equilibrium pair. We define such a pair as a Nash
equilibrium. For a two-player game, where the matrices A and B define the cost
for players 1 and 2, respectively, the strategy $(y^*, z^*)$ is a Nash
equilibrium if:

\[ y^{*T} A z^* \le y^T A z^* \quad \forall y \in Y \]
\[ y^{*T} B z^* \le y^{*T} B z \quad \forall z \in Z \]

in which $Y$ and $Z$ are the sets of possible mixed strategies for players 1 and
2, respectively. Remember that the elements of $Y$ and $Z$ are vectors defining
the probability of choosing different strategies. For example, for a $y \in Y$,
$y = [y_1, y_2, \ldots, y_m]^T$, we have $\sum_{i=1}^{m} y_i = 1$, in which
$y_i \ge 0$ defines the probability of choosing the strategy $i$.

If a player plays the game according to the strategy defined by the Nash
equilibrium, then we say that the player is using a Nash strategy. The Nash
strategy safeguards each player against attempts by any one player to further
improve on his individual performance criterion. Moreover, each player knows
the expected cost for the game solution $(y^*, z^*)$. For player 1 the expected
cost is $y^{*T} A z^*$, and for player 2 it is $y^{*T} B z^*$. For the case of
two players, the Nash equilibrium can be found using quadratic programming. In
general, for multi-player games, the Nash equilibrium is found using nonlinear
programming.

This solution generally assumes that the players know each other's cost matrices
and that, when the strategies have been calculated, they are announced at the
same instant of time.

Note that the Nash strategy does not correspond in general with the security
strategy. When the game has a unique Nash equilibrium for pure strategies, then
the Nash equilibrium maximizes the security strategy.
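Because the expected costs $y^T A z$ and $y^T B z$ are linear in each player's
own mixed strategy, it suffices to compare a candidate pair against pure
deviations only. The sketch below uses that fact to test the two inequalities;
the function names are hypothetical and the matrices are illustrative
assumptions, not data taken from these notes.

```python
def expected_cost(y, M, z):
    """Compute y^T M z for probability vectors y, z and a cost matrix M."""
    return sum(y[i] * M[i][j] * z[j]
               for i in range(len(y)) for j in range(len(z)))


def is_mixed_nash(y, z, A, B, tol=1e-9):
    """Check the two Nash inequalities using pure-strategy deviations."""
    m, n = len(A), len(A[0])
    # Best loss player 1 can get with any single row against z.
    best_row = min(sum(A[i][j] * z[j] for j in range(n)) for i in range(m))
    # Best loss player 2 can get with any single column against y.
    best_col = min(sum(y[i] * B[i][j] for i in range(m)) for j in range(n))
    return (expected_cost(y, A, z) <= best_row + tol and
            expected_cost(y, B, z) <= best_col + tol)


# Assumed example matrices (costs for players 1 and 2).
A = [[-2, 1], [-1, -1]]
B = [[-1, 1], [2, -2]]

print(is_mixed_nash([1, 0], [1, 0], A, B))              # pure equilibrium
print(is_mixed_nash([2 / 3, 1 / 3], [2 / 3, 1 / 3], A, B))  # mixed equilibrium
print(is_mixed_nash([1, 0], [0, 1], A, B))              # not an equilibrium
```

The mixed pair makes each player indifferent between its two pure strategies,
which is why neither can gain by deviating. The check only verifies a given
candidate; finding one still requires the programming methods mentioned above.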
Chapter 10

Sequential Decision Theory



Chapter Status

What does this mean? Check

http://msl.cs.uiuc.edu/planning/status.html

for information on the latest version.



These are class notes from CS497 Spring 2003, parts of which were scribed by 
Xiaolei Li, Warren Shen, Sherwin Tarn. 

10.1 Basic Definitions 

Notation

$U(x_k)$: the set of decision maker actions from the state $x_k$;
$u_k \in U(x_k)$

$\Theta(x_k)$: the set of actions nature can perform in state $x_k$;
$\theta_k \in \Theta(x_k)$

$\theta_k \in \Theta(x_k, u_k)$: like above, except that nature responds to the
decision maker

The state transition equation: $x_{k+1} = f(x_k, u_k, \theta_k)$

The cost functional: $L = \sum_{i=1}^{K} l(x_i, u_i, \theta_i) + l_F(x_F)$

Use termination actions as before. Also, assume the current state is always
known.

10.1.1 Non-Deterministic Forward Projection

If nature is non-deterministic, what will our next state be, given our current
state and the action that we apply?

Let $\theta_k$ be whatever action nature does after we apply $u_k$ in state
$x_k$, and let $F(x_k, u_k)$ be the set of states that we can be in after $u_k$
and $\theta_k$ are applied from state $x_k$. So, we have
$\theta_k \in \Theta(x_k, u_k)$ and $F(x_k, u_k) \subseteq X$, where $X$ is
the set of all states.

In the non-deterministic case, we have

\[ F(x_k, u_k) = \{ x_{k+1} \mid \exists \theta_k \in \Theta(x_k, u_k) \text{ such that } x_{k+1} = f(x_k, u_k, \theta_k) \} \]

This is a 1-stage forward projection, where we project all possible states one
step forward. Here is a 2-stage forward projection:

\[ F(F(x_k, u_k), u_{k+1}) \]

in which $F$ applied to a set of states means the union of $F(x, u_{k+1})$ over
all states $x$ in that set. This can be further expanded to any number of stages.
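The projection can be sketched directly as a set comprehension. The toy model
here is an assumption made only for illustration: states are integers, the
transition is $f(x, u, \theta) = x + u + \theta$, and nature may always pick
$\theta \in \{-1, 0, 1\}$.

```python
def forward_projection(states, u, f, Theta):
    """Project a set of states one stage forward under action u.

    Returns every state reachable for some nature action theta in Theta(x, u).
    """
    return {f(x, u, theta) for x in states for theta in Theta(x, u)}


# Hypothetical toy model: motion on the integers with bounded disturbance.
def f(x, u, theta):
    return x + u + theta


def Theta(x, u):
    return (-1, 0, 1)


one_stage = forward_projection({0}, 1, f, Theta)        # F(x_k, u_k)
two_stage = forward_projection(one_stage, 1, f, Theta)  # F(F(x_k, u_k), u_{k+1})
print(sorted(one_stage))  # [0, 1, 2]
print(sorted(two_stage))  # [0, 1, 2, 3, 4]
```

Passing a set as the first argument is what makes the 2-stage projection a
simple repeated call.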

10.1.2 Probabilistic Forward Projection 

In the probabilistic case, assume we have a probability distribution over the
possible actions nature can do given our action $u_k$ in state $x_k$. Thus, the
probability that nature performs action $\theta_k$ could be written as

\[ P(\theta_k \mid x_1, \ldots, x_k, u_1, \ldots, u_k, \theta_1, \ldots, \theta_{k-1}) \]

which is the probability given everything that has happened in the past.

This is too big, so our solution is to arbitrarily knock out stuff until we're
happy. To make it not so arbitrary, we'll go by the Markovian assumption and
say that the probability depends only on local values. So, now we have

\[ P(\theta_k \mid x_k, u_k) \]

Now, given this probability as well as the fact that we're in state $x_k$ and
apply action $u_k$, we want to get the probability that we'll get to state
$x_{k+1}$. We can simply combine the state transition function,
$x_{k+1} = f(x_k, u_k, \theta_k)$, and $P(\theta_k \mid x_k, u_k)$ to get
$P(x_{k+1} \mid x_k, u_k)$. This is a 1-stage probabilistic forward projection.

A 2-stage probabilistic forward projection is the probability that we'll get to
a state $x_{k+2}$ from $x_k$. This probability is
$P(x_{k+2} \mid x_k, u_k, u_{k+1})$. In order to get this to a form we know, we
can marginalize over the intermediate state and get

\[ P(x_{k+2} \mid x_k, u_k, u_{k+1}) = \sum_{x_{k+1}} P(x_{k+2} \mid x_{k+1}, u_{k+1}) \, P(x_{k+1} \mid x_k, u_k) \]
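The two steps, pushing nature's distribution through the transition function
and then marginalizing over the intermediate state, can be sketched as follows.
The model is a hypothetical one chosen for easy arithmetic:
$f(x, u, \theta) = x + u + \theta$ with $\theta \in \{-1, 0, 1\}$ having
probabilities 1/4, 1/2, 1/4. The function names are also assumptions.

```python
from collections import defaultdict


def transition_distribution(x, u, f, p_theta):
    """Return P(x_next | x, u) as a dict, by summing nature's probabilities
    over all theta that lead to the same next state."""
    dist = defaultdict(float)
    for theta, p in p_theta(x, u).items():
        dist[f(x, u, theta)] += p
    return dict(dist)


def project(belief, u, f, p_theta):
    """One probabilistic forward projection of a distribution over states.

    Marginalizes over the current state, as in the 2-stage formula.
    """
    nxt = defaultdict(float)
    for x, px in belief.items():
        for x_next, p in transition_distribution(x, u, f, p_theta).items():
            nxt[x_next] += px * p
    return dict(nxt)


# Hypothetical model used only for illustration.
def f(x, u, theta):
    return x + u + theta


def p_theta(x, u):
    return {-1: 0.25, 0: 0.5, 1: 0.25}


one_stage = project({0: 1.0}, 1, f, p_theta)
two_stage = project(one_stage, 1, f, p_theta)
print(one_stage)  # {0: 0.25, 1: 0.5, 2: 0.25}
print(two_stage)  # probabilities over states 0..4 that sum to 1
```

Representing a distribution as a dictionary keeps only the reachable states,
which mirrors the set-valued projection of the non-deterministic case.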

10.1.3 Strategies 

A strategy $\gamma : X \to U$ is a plan that tells the decision maker what
action to take given a state ($u_k = \gamma(x_k)$). Remember that our cost
functional is $L = l_I(x_1) + \sum_{i=1}^{K} l(x_i, u_i, \theta_i) + l_F(x_F)$.
How should we choose our strategy?

In the non-deterministic case, we can think of our strategy problem as a table,
where each row specifies a state we could be in right now, and each column
specifies states we could go to if we applied $\gamma(x_k)$. Wherever there is
a 1, that means that the state is reachable from our current state.
For example, if we were in state 2 and we applied the strategy that this table
represents, we could end up in $1, 3, \ldots, n$, but not 2 because there is a 0
in that position. Since nature is non-deterministic, we don't know which state
we'll end up in.

So, given our cost functional $L$, we want to choose $\gamma$ to minimize the
worst-case cost.

In the probabilistic case, we can also think of our strategy in terms of a table,
except that instead of having boolean values in the entries, we have probabilities
(rows are the current state, columns are the next state):

          1       2       3      ...     n
    1    .6      .02     .01     ...    .3
    2    .02     .4      .2      ...    .13
    3    .2      .4      .02     ...    .2
    n    0.0     .14     .33     ...    .43

For example, in the entry (2, 3), we have
$P(x_{k+1} = 3 \mid x_k = 2, u_k = \gamma(2)) = .2$.
Given our cost functional $L$, in the probabilistic case, we want to choose
$\gamma$ to minimize the expected cost.



10.2 Dynamic Programming over Discrete Spaces 

Setting it up As before, we have stages $1, 2, \ldots, K - 1, K, F$. Here are
some definitions, which have been adapted from our dynamic programming for games
without nature:

In the non-deterministic case:

\[ L^*_{F,F}(x_F) = l_F(x_F) \]

\[ L^*_{K,F}(x_K) = \min_{u_K} \max_{\theta_K} \{ l(x_K, u_K, \theta_K) + l_F(x_F) \} \]

\[ L^*_{1,F}(x_1) = \min_{u_1} \max_{\theta_1} \min_{u_2} \max_{\theta_2} \cdots \min_{u_K} \max_{\theta_K} \left\{ l_I(x_1) + \sum_{i=1}^{K} l(x_i, u_i, \theta_i) + l_F(x_F) \right\} \]

In the probabilistic case:

\[ L^*_{F,F}(x_F) = l_F(x_F) \]

\[ L^*_{K,F}(x_K) = \min_{u_K} \left\{ E_{\theta_K} \left[ l(x_K, u_K, \theta_K) + l_F(x_F) \right] \right\} \]

\[ L^*_{1,F}(x_1) = \min_{u_1, \ldots, u_K} \left\{ E_{\theta_1, \ldots, \theta_K} \left[ l_I(x_1) + \sum_{i=1}^{K} l(x_i, u_i, \theta_i) + l_F(x_F) \right] \right\} \]

where $E_{\theta_K}$ is the expectation over $\theta_K$, and
$E_{\theta_1, \ldots, \theta_K}$ is the expectation over
$\theta_1, \ldots, \theta_K$.

Finding the minimum loss Now, to find the loss from any stage $k$, we use
dynamic programming, as before. In the non-deterministic case we have

\[ L^*_{k,F}(x_k) = \min_{u_k \in U(x_k)} \max_{\theta_k \in \Theta(x_k, u_k)} \left\{ L^*_{k+1,F}(x_{k+1}) + l(x_k, u_k, \theta_k) \right\} \]

where $x_{k+1}$ is defined by our state transition equation
$x_{k+1} = f(x_k, u_k, \theta_k)$.

In the probabilistic case we have

\[ L^*_{k,F}(x_k) = \min_{u_k \in U(x_k)} \left\{ E_{\theta_k} \left[ L^*_{k+1,F}(x_{k+1}) + l(x_k, u_k, \theta_k) \right] \right\} \]

\[ \phantom{L^*_{k,F}(x_k)} = \min_{u_k \in U(x_k)} \left\{ \sum_{\theta_k} \left( L^*_{k+1,F}(x_{k+1}) + l(x_k, u_k, \theta_k) \right) P(\theta_k \mid x_k, u_k) \right\} \]

However, if $l(x_k, u_k, \theta_k) = l(x_k, u_k)$, then our formula becomes

\[ L^*_{k,F}(x_k) = \min_{u_k \in U(x_k)} \left\{ l(x_k, u_k) + \sum_{x_{k+1}} L^*_{k+1,F}(x_{k+1}) \, P(x_{k+1} \mid x_k, u_k) \right\} \]

Here, we've just made the $\theta$ go away, but in reality it's just hiding
inside of $P(x_{k+1} \mid x_k, u_k)$.
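Both recurrences share the same backward sweep and differ only in how the
outcomes of nature are aggregated (max versus expectation). The following is a
minimal sketch on a hypothetical three-state problem in which the action "go"
advances one state but may slip; every name and number here is an assumption
made for illustration.

```python
def backward_dp(states, U, nature, f, l, l_F, K, mode="worst"):
    """Finite-horizon backward dynamic programming.

    nature(x, u) returns {theta: probability}. In "worst" mode only the keys
    are used (minimax); in "expected" mode the probabilities weight outcomes.
    Returns (L, gamma): L[k][x] is the optimal loss-to-go from stage k and
    gamma[k][x] is an optimal action.
    """
    L = {K + 1: {x: l_F(x) for x in states}}
    gamma = {}
    for k in range(K, 0, -1):
        L[k], gamma[k] = {}, {}
        for x in states:
            best_u, best_val = None, float("inf")
            for u in U(x):
                outcomes = [(p, l(x, u, th) + L[k + 1][f(x, u, th)])
                            for th, p in nature(x, u).items()]
                if mode == "worst":
                    val = max(v for _, v in outcomes)
                else:
                    val = sum(p * v for p, v in outcomes)
                if val < best_val:
                    best_u, best_val = u, val
            L[k][x], gamma[k][x] = best_val, best_u
    return L, gamma


# Hypothetical problem: states 0, 1, 2 with goal state 2.
states = [0, 1, 2]
U = lambda x: ["stay"] if x == 2 else ["stay", "go"]
nature = lambda x, u: {0: 0.5, 1: 0.5} if u == "go" else {0: 1.0}
f = lambda x, u, th: min(x + th, 2) if u == "go" else x
l = lambda x, u, th: 0 if x == 2 else 1      # unit loss until the goal
l_F = lambda x: 0 if x == 2 else 10          # penalty for missing the goal

L_worst, g_worst = backward_dp(states, U, nature, f, l, l_F, K=2, mode="worst")
L_exp, g_exp = backward_dp(states, U, nature, f, l, l_F, K=2, mode="expected")
print(L_worst[1][0], L_exp[1][0])  # 12 9.5
```

In the worst case nature can make every "go" slip, so moving is pointless from
state 0, while in expectation "go" is clearly preferred.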

Issues with Cycles As before, assume termination actions to end our game
when we reach a goal state, and also assume that $K$ is unknown. What if there
are cycles in our problem, where a series of actions could potentially bring you
back to the same state? How do we make sure that our program terminates?

In the non-deterministic case, there must be no negative cycles (in reality,
there must be no minimax cycles), and there also must be a way to escape or
avoid all positive cycles. If there were negative cycles, meaning that even with
nature we can perform actions in a cycle such that we still have negative loss, then
the optimal strategy would be to go around forever to minimize loss. If there were
positive cycles that we couldn't escape from or avoid, then it would be possible
for nature to keep on sending us through the cycle forever.

In the probabilistic case, as long as no transition probability at the start of
a cycle is 1, we will terminate. For example, suppose we had this graph, where
nodes are states and edges are transitions:



[Figure: a graph containing a cycle through state K; at K, each of the two
outgoing transitions has probability P = 1/2.]



Assume that all transitions have a loss of 1. If we were at state $K$, we would
want to go to $x_G$. If we go straight from $x_1$ to $x_G$, we will have a total
loss of 3. However, at $K$ there is a chance of 1/2 that we will go around the
cycle. If we go around, we'll acquire an additional loss of 4 each time. Thus,
the expected loss becomes a sum over the possible numbers of trips around the
cycle, each weighted by its probability.
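One way to write out this sum, as a sketch under assumptions that the text does
not state explicitly: the direct route costs 3, each arrival at $K$
independently sends us around the cycle with probability 1/2, and each trip
around adds a loss of 4. Under those assumptions, exactly $n$ trips occur with
probability $(1/2)^{n+1}$, and

```latex
\[
E[L] \;=\; \sum_{n=0}^{\infty} \left(\tfrac{1}{2}\right)^{n+1} (3 + 4n)
     \;=\; 3 + 4 \sum_{n=0}^{\infty} \frac{n}{2^{n+1}}
     \;=\; 3 + 4 \cdot 1 \;=\; 7 .
\]
```

The series is finite because the probability of $n$ trips decays geometrically
while the loss grows only linearly in $n$.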



If the probabilities are less than 1, then the expected loss converges to some
finite value ($E[L] < \infty$), meaning that we will terminate. However, no
matter how far we go into the future, there will be some exponentially small
chance that we will keep going around the cycle. To calculate $L^*_k$, when do
we stop? We can pick some threshold and stop once further iterations change the
loss by less than that amount.



10.3 Infinite Horizon Problems 

In an infinite-horizon MDP, $K$ (the number of stages) is infinite and there are
no termination actions. In this situation, the accumulated loss
($\sum_{i=1}^{K} l(x_i, u_i)$) will be $\infty$. This means that we will end up
with an $\infty$-loss plan, which would be quite useless. There are two
solutions to the problem, and they are described below. The first one is to
average the loss-per-stage and derive the limit. The second is to discount
losses in the future. We will look at both of them but focus in depth on the
latter.



10.3.1 Average Loss-Per-Stage

The intuition behind this idea is to basically limit the horizon. In this manner,
we could figure out the average loss per stage and calculate the limit as
$K \to \infty$. The exact equation is shown below.
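The expression itself is not legible at this point. A common way to write the
average loss-per-stage criterion, offered here as an assumption rather than as
the original equation, is

```latex
\[
\bar{L} \;=\; \lim_{K \to \infty} \frac{1}{K} \,
  E\!\left[ \sum_{i=1}^{K} l(x_i, u_i, \theta_i) \right] .
\]
```

Dividing by $K$ keeps the quantity bounded whenever the per-stage losses are
bounded, even though the accumulated sum itself diverges.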



10.3.2 Discounted Loss 

An alternative to the average loss-per-stage scheme is the concept of discounted
loss. The intuition is that losses in the far future do not count too much. So the
discounted loss scheme will gradually reduce the losses in the future to zero.
This will force $\sum_i l(x_i, u_i, \theta_i)$ to converge. The exact definition
of the discounted loss functional is shown below. The $\alpha$ is known as the
discount factor. A larger $\alpha$ gives more weight to the future.

\[ L = \sum_{i=1}^{\infty} \alpha^{i-1} \, l(x_i, u_i, \theta_i), \qquad 0 < \alpha < 1 \]   (10.1)

With the above definition, it is clear that
$\lim_{i \to \infty} \alpha^{i-1} = 0$. Thus, as $i$ approaches $\infty$, the
term inside the summation goes to 0. Since the weights decay geometrically, the
entire sum converges whenever the per-stage losses are bounded.

10.3.3 Optimization in the Discounted Loss Model

Using the discounted loss model described in the previous section, it is now
possible to optimize infinite-horizon MDPs using dynamic programming (DP). We
need to find the best policy ($\gamma$) such that $L$ is minimized (i.e.,
optimize Equation (10.1)). Before we look at dynamic programming, let us examine
how $L$ accumulates as $K$ increases. When $K = 1$, there are no losses. As $K$
increments, additional loss terms are attached, as shown below.

Figure 10.1: Discounted Loss Growth
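The way the discounted loss grows and then levels off as stages are added can
be seen numerically. This sketch assumes the discounted loss has the form
$\sum_{i=1}^{K} \alpha^{i-1} l_i$ with $0 < \alpha < 1$; the function name and
the constant per-stage loss are illustrative choices.

```python
def discounted_loss(stage_losses, alpha):
    """Sum of alpha**(i-1) * l_i over a finite list of per-stage losses."""
    if not 0 < alpha < 1:
        raise ValueError("discount factor must satisfy 0 < alpha < 1")
    # enumerate starts at 0, which matches the exponent i - 1 for stage i.
    return sum(alpha ** i * loss for i, loss in enumerate(stage_losses))


# With a constant loss of 1 per stage, the total approaches 1 / (1 - alpha).
for K in (1, 2, 5, 20, 50):
    print(K, discounted_loss([1.0] * K, 0.5))
```

Each additional stage contributes a term that is smaller by a factor of
$\alpha$, so the partial sums are bounded by the geometric series limit
$l_{max} / (1 - \alpha)$.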
10.3.4 Forward Dynamic Programming

From Figure 10.1, we can easily envision how forward dynamic programming
can solve for $L$. We can set $L^*_1$ to 0 and, at each iteration after, find
the best next step. In other words, search through all possible $\gamma$'s and
find the one that gives the least $L^*_{i+1}$, where $i$ is the current stage.
As $i$ increases, each $|L^*_{i+1} - L^*_i|$ will get smaller and smaller
because $\lim_{i \to \infty} \alpha^{i-1} = 0$. And we can use a condition
similar to Equation (??) that will allow the DP to stop after so many stages. This
process sounds fairly easy on paper but turns out to be rather difficult in practice.
Therefore, we will instead use backwards dynamic programming.

10.3.5 Backwards Dynamic Programming 

Similar to forward dynamic programming, the backwards method will work in an
iterative fashion. The main difference is that it will start at the end. What is the
end for our problem? It's stage $K$. But in the infinite-horizon MDP, $K$ is
equal to $\infty$. This presents a problem in that we cannot annotate stage
$\infty$; we will use a notational trick to get around this problem.

Recall in Figure 10.1 that each dynamic programming step added a term to
$L$. In the forward DP method, we can envision this process as shown in Figure
10.2. In the backward DP method, we can envision the growth pattern in Figure
10.2 as being flipped upside down in Figure 10.3.



Figure 10.2: FDP Growth Figure 10.3: BDP Growth 



An observation we could make about Figure 10.3 is that the bottom of the
stage list is growing into the past. In other words, the stages in the previous
step of the DP are being slid into the future. Due to discounted loss, we will
need to multiply them by $\alpha$ because they're now further in the future. To
make this process natural in terms of notation, we will define a new term $J^*$
as below.

\[ J^*_{K-k}(x_k) = \alpha^{-k} L^*_k(x_k) \]

For example, if $K$ was equal to 5, $L^*_5$ corresponds to $J^*_0$ and $L^*_1$
corresponds to $J^*_4$ (up to the factor $\alpha^{-k}$). Intuitively, $J^*_i$ is
the expected loss for an $i$-stage optimal strategy. Recall that the original
dynamic programming had the solution of:
\[ L^*_K(x) = 0 \quad \forall x \in X \]

\[ L^*_k(x_k) = \min_{u_k \in U(x_k)} E_{\theta_k} \left\{ \alpha^k l(x_k, u_k, \theta_k) + L^*_{k+1}(f(x_k, u_k, \theta_k)) \right\} \]   (10.2)

Equipped with the new $J$ notation, we will rewrite Equation (10.2) as the
following by replacing all $L$'s with $J$'s.

\[ \alpha^k J^*_{K-k}(x_k) = \min_{u_k \in U(x_k)} E_{\theta_k} \left\{ \alpha^k l(x_k, u_k, \theta_k) + \alpha^{k+1} J^*_{K-k-1}(f(x_k, u_k, \theta_k)) \right\} \]

We will then divide out $\alpha^k$ from the equation and also rewrite $(K - k)$
as $i$. This will leave us with the following.

\[ J^*_i(x_k) = \min_{u_k \in U(x_k)} E_{\theta_k} \left\{ l(x_k, u_k, \theta_k) + \alpha J^*_{i-1}(f(x_k, u_k, \theta_k)) \right\} \]

And more generally,

\[ J^*(x) = \min_{u \in U(x)} E_{\theta} \left\{ l(x, u, \theta) + \alpha J^*(f(x, u, \theta)) \right\} \]

Note that now it is possible to enumerate through the backwards DP by starting
at $J^*_0$. It would be just like solving the original BDP by starting at
$L^*_K$. Furthermore, if we remove the min term in front of the equation and
fix $u = \gamma(x)$, it also allows us to evaluate a particular strategy:

\[ J_\gamma(x) = E_{\theta} \left\{ l(x, u, \theta) + \alpha J_\gamma(f(x, u, \theta)) \right\} \]



It is also possible that our loss function could be independent of nature. That
is, $l(x, u, \theta) = l(x, u)$. We can then further simplify the last pair of
equations to the following. For simplicity, we will rewrite $f(x, u, \theta)$ as
$x'$.

\[ J^*(x) = \min_{u \in U(x)} \left\{ l(x, u) + \alpha \sum_{x'} P(x' \mid x, u) \, J^*(x') \right\} \]   (10.3)

\[ J_\gamma(x) = l(x, u) + \alpha \sum_{x'} P(x' \mid x, u) \, J_\gamma(x'), \quad u = \gamma(x) \]   (10.4)

Notice that the loss function no longer has $\theta$ as a parameter. This allows
us to remove the expectation over nature from the equation. However, since $x'$
still depends on nature, we simply wrote out the definition of expectation as a
weighted sum over all $x'$. This hides $\theta$ amongst the probabilities. For a
given fixed strategy, it is now possible to find $J_\gamma$ by iteratively
evaluating Equation (10.4) until a condition such as Equation (??) is satisfied.
This is known as value iteration.
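The same iterative scheme applied to Equation (10.3), that is, with the min
over actions kept, can be sketched as below. The two-state model is
hypothetical: the transition probabilities, the value used for $l(1, b)$, and
the discount factor 0.9 are assumptions, and the stopping rule (largest change
below a tolerance) stands in for the unspecified termination condition.

```python
def value_iteration(states, actions, P, l, alpha, tol=1e-10, max_sweeps=100_000):
    """Repeat J(x) <- min_u [ l(x,u) + alpha * sum_x' P(x'|x,u) J(x') ].

    P[(x, u)] is a dict {x_next: probability}; l[(x, u)] is the stage loss.
    Stops when no entry of J changes by more than tol.
    """
    J = {x: 0.0 for x in states}  # J_0: no stages left, so no loss
    for _ in range(max_sweeps):
        J_next = {
            x: min(
                l[(x, u)] + alpha * sum(p * J[x2] for x2, p in P[(x, u)].items())
                for u in actions
            )
            for x in states
        }
        if max(abs(J_next[x] - J[x]) for x in states) < tol:
            return J_next
        J = J_next
    return J


# Hypothetical two-state, two-action model (all numbers are assumptions).
states, actions = [1, 2], ["a", "b"]
P = {
    (1, "a"): {1: 0.75, 2: 0.25}, (2, "a"): {1: 0.25, 2: 0.75},
    (1, "b"): {1: 0.25, 2: 0.75}, (2, "b"): {1: 0.75, 2: 0.25},
}
l = {(1, "a"): 2.0, (1, "b"): 0.5, (2, "a"): 1.0, (2, "b"): 3.0}

J = value_iteration(states, actions, P, l, alpha=0.9)
print(J)  # approximately {1: 8.375, 2: 8.875}
```

Each sweep shrinks the error by roughly a factor of $\alpha$, so a discount
factor close to 1 needs many more sweeps than a small one.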
10.3.6 Policy Iteration

The method of finding an optimal strategy, $\gamma^*$, using Equation (10.3) is
known as policy iteration. The process can be summarized below.

1. Guess a strategy $\gamma$.

2. Evaluate $\gamma$ using Equation (10.4).

3. Use Equation (10.3) to find an improved $\gamma'$.

4. Go back to Step 2 and repeat until no improvements occur in Step 3.

Example We shall illustrate the above algorithm through a simple example.
Suppose we have $X = \{1, 2\}$ and $U = \{a, b\}$. Let Figures 10.4 and 10.5 be
the probabilities of actions $a$ and $b$. In addition, let the discount factor
$\alpha$ equal ??.




Figure 10.4: Action a Figure 10.5: Action b 

Assuming that $l(x, u, \theta) = l(x, u)$, we have the following loss values.

    $l(1, a) = 2$      $l(1, b) = $ ??
    $l(2, a) = 1$      $l(2, b) = 3$

Now, let us follow the algorithm described earlier. Step 1 is to choose an
initial $\gamma$. We will randomly choose one that is $\gamma(1) = a$ and
$\gamma(2) = b$. In other words, choose action $a$ when in state 1 and choose
action $b$ when in state 2.

Step 2 is to evaluate $\gamma$ using Equation (10.4). This results in the
following pair of equations, in which the transition probabilities are those of
Figures 10.4 and 10.5. There are two equations in two unknowns, so they can be
easily solved.

\[ J_\gamma(1) = l(1, a) + \alpha \left( P(1 \mid 1, a) J_\gamma(1) + P(2 \mid 1, a) J_\gamma(2) \right) \]

\[ J_\gamma(2) = l(2, b) + \alpha \left( P(1 \mid 2, b) J_\gamma(1) + P(2 \mid 2, b) J_\gamma(2) \right) \]

\[ J_\gamma(1) = 24.12 \qquad J_\gamma(2) = 25.96 \]
Step 3 is to minimize $J_\gamma$. With the answers above, we can now evaluate
Equation (10.3) by putting them in place of $J^*(x')$. This will let us find a
new $\gamma$, which we can evaluate again in Step 2 (it will turn out to be
$\gamma(1) = b$ and $\gamma(2) = a$).
This process is relatively simple and is guaranteed to find a global minimum.
However, when the number of states is large and the number of actions is
large, the system of equations can become impossible to solve practically.
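The four steps of policy iteration can be sketched as follows. Because the
transition probabilities, the value of $l(1, b)$, and the discount factor are
not recoverable here, the numbers below are assumptions, so the resulting
values differ from 24.12 and 25.96. Evaluation is done by repeated application
of Equation (10.4) instead of solving the linear system directly, which is a
substitution made to keep the sketch short.

```python
def backup(x, u, J, P, l, alpha):
    """Right-hand side of Equations (10.3)/(10.4) for one state-action pair."""
    return l[(x, u)] + alpha * sum(p * J[x2] for x2, p in P[(x, u)].items())


def evaluate(policy, states, P, l, alpha, tol=1e-10, max_sweeps=100_000):
    """Step 2: approximate J_gamma by iterating Equation (10.4)."""
    J = {x: 0.0 for x in states}
    for _ in range(max_sweeps):
        J_next = {x: backup(x, policy[x], J, P, l, alpha) for x in states}
        if max(abs(J_next[x] - J[x]) for x in states) < tol:
            return J_next
        J = J_next
    return J


def policy_iteration(states, actions, P, l, alpha, policy, max_rounds=1000):
    """Steps 2-4: evaluate, improve greedily, stop when nothing changes."""
    for _ in range(max_rounds):
        J = evaluate(policy, states, P, l, alpha)
        improved = {x: min(actions, key=lambda u: backup(x, u, J, P, l, alpha))
                    for x in states}
        if improved == policy:
            return policy, J
        policy = improved
    return policy, J


# Hypothetical model (all probabilities, l(1, b), and alpha are assumptions).
states, actions = [1, 2], ["a", "b"]
P = {
    (1, "a"): {1: 0.75, 2: 0.25}, (2, "a"): {1: 0.25, 2: 0.75},
    (1, "b"): {1: 0.25, 2: 0.75}, (2, "b"): {1: 0.75, 2: 0.25},
}
l = {(1, "a"): 2.0, (1, "b"): 0.5, (2, "a"): 1.0, (2, "b"): 3.0}

# Step 1: the initial guess used in the example, gamma(1) = a and gamma(2) = b.
best, J = policy_iteration(states, actions, P, l, 0.9, {1: "a", 2: "b"})
print(best)  # {1: 'b', 2: 'a'}
```

With these assumed numbers the first improvement step already switches to
$\gamma(1) = b$, $\gamma(2) = a$, and the second round confirms that no further
improvement is possible.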

10.4 Dynamic Programming over Continuous Spaces 

I wrote this in 1997 for CS326a at Stanford. It needs to be better integrated into
the current context. It should also be enhanced to allow nature to be added.

This section describes how the dynamic programming principle can be used 
to compute optimal motion plans. Optimality is expressed with respect to a 
desired criterion, and the method can only provide a solution that is optimal for 
a specified resolution. The method generally applies to a variety of holonomic 
and nonholonomic problems, and can be adapted to many other problems that 
involve complications such as stochastic uncertainty in prediction. The primary 
drawback of the approach is that the computation time and space are exponential 
in the dimension of the C-space. This limits its applicability to three or four- 
dimensional problems (however, for many problems it is the only known method 
to obtain optimal solutions). Although there are connections between dynamic 
programming in this context and in graph search, its use in these notes applies 
to continuous spaces. The dynamic programming formulation presented here is 
more similar to what appears in optimal control literature [19, 110, 433]. 

10.4.1 Reformulating Motion Planning 

Recall that the goal of the basic motion planning problem is to compute a path
$\tau : [0, 1] \to \mathcal{C}_{free}$ such that $\tau(0) = q_{init}$ and
$\tau(1) = q_{goal}$, when such a path exists. In the case of nonholonomic
systems, velocity constraints must additionally be satisfied.

We are next going to add some new concepts to the standard motion planning
formulation. First, it will be helpful to define time, both for motion planning
problems that vary over time and to help in the upcoming concepts. Since time
is irrelevant for basic motion planning, it can be considered in this case as an
auxiliary variable that only assists in the formulation. Suppose that there is some
initial time, $t = 0$, at which the robot is at $q_{init}$. Suppose also that
there is some final time, $T_f$ (one would like to at least have the robot at
the goal before $t = T_f$).

Recall that $\dot{q}$ represents the velocity of the robot in the configuration
space. Suppose that $\mathcal{C}$ is an $m$-dimensional configuration space, and
that $u$ is a continuous, vector-valued function that depends on time:
$u : [0, T_f] \to \mathbb{R}^m$. If we select $u$, and let $\dot{q}(t) = u(t)$,
then a trajectory has been specified for the robot: $q(0) = q_{init}$, and for
$0 < t \le T_f$, we can obtain

\[ q(t) = q(0) + \int_0^t u(t') \, dt' \]   (10.5)

The function $u$ can be considered as a control input, because it allows us to
move the robot by specifying its velocity. As will be seen shortly, the case of
$\dot{q}(t) = u(t)$ corresponds to a holonomic planning problem. Suppose that we
can choose any control input such that for all $t$ it is either normalized,
$\|u(t)\| = 1$, or $u(t) = 0$.

This implies that we can locally move the robot in any allowable direction from its
tangent space. For nonholonomic problems, one will only be allowed to move the
robot through a function of the form $\dot{q} = f(q(t), u(t))$. For example, as
described in [437], p. 432, the equations for the nonholonomic car robot can be
expressed as $\dot{x} = v \cos(\theta)$, $\dot{y} = v \sin(\theta)$, and
$\dot{\theta} = \frac{v}{L} \tan(\phi)$, in which $L$ here denotes the distance
between the front and rear axles. Using the notation in these notes,
$(x, y, \theta)$ becomes $(q_1, q_2, q_3)$, and $(v, \phi)$ becomes
$(u_1, u_2)$. The function $f$ can be considered as a kind of interface between
the user and the robot. Commands are specified through $u(t)$, but the resulting
velocities in the configuration space get transformed using $f$ (which in
general prevents the user from directly controlling velocities).
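The role of $f$ as an interface can be illustrated with a single Euler
integration step of the car equations. This is only a sketch: the function
name, the step size, and the wheelbase value are assumptions, and the parameter
`wheelbase` plays the role of the axle distance in the steering equation.

```python
import math


def car_step(q, u, dt, wheelbase=1.0):
    """Advance the car configuration q = (x, y, theta) by one Euler step.

    u = (v, phi) is the commanded speed and steering angle. The commands do
    not set the configuration velocity directly; they pass through f.
    """
    x, y, theta = q
    v, phi = u
    return (
        x + dt * v * math.cos(theta),
        y + dt * v * math.sin(theta),
        theta + dt * (v / wheelbase) * math.tan(phi),
    )


# Driving straight along the x-axis: only x changes.
print(car_step((0.0, 0.0, 0.0), (1.0, 0.0), 0.1))

# With zero speed the steering angle has no effect: the car cannot turn in place.
print(car_step((0.0, 0.0, 0.0), (0.0, 0.5), 0.1))
```

No choice of $(v, \phi)$ produces pure sideways motion from a given heading,
which is the velocity constraint that makes the problem nonholonomic.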

Incorporating optimality If we want to consider optimality, then it will be
helpful to define a function that assigns a cost to a given trajectory that is executed
by the robot. One can also make this cost depend on the control function. For
example, if the control is an accelerator of a car, then one might want to penalize
rapid accelerations, which use more fuel. A loss functional is defined that evaluates
any configuration trajectory and control function:

\[ L = \int_0^{T_f} l(q(t), u(t)) \, dt + Q(q(T_f)) \]   (10.6)

The integrand $l(q(t), u(t))$ represents an instantaneous cost, which when
integrated can be imagined as the total amount of energy that is expended. The
term $Q(q(T_f))$ is a final cost that can be used to induce a preference over
trajectories that terminate in a goal region of the configuration space.

The loss functional can be reduced to a binary function when encoding a basic
path planning problem that does not involve optimality. The loss functional can
be simplified to $L = Q(q(T_f))$. We take $Q(q(T_f)) = 0$ if
$q(T_f) = q_{goal}$, and $Q(q(T_f)) = 1$ otherwise. This partitions the space of
control functions into two classes: control functions that cause the basic
motion planning problem to be solved receive zero loss; otherwise, unit loss is
received.

The previous formulation considered all control inputs that achieve the goal
to be equivalent. As another example, the following measures the path length for
control inputs that lead to the goal:

\[ L = \begin{cases} \int_0^{T_f} \|\dot{q}(t)\| \, dt & \text{if } q(T_f) = q_{goal} \\ \infty & \text{otherwise} \end{cases} \]   (10.7)

The term $\int_0^{T_f} \|\dot{q}(t)\| \, dt$ measures path length, and recall
that $\dot{q}(t) = u(t)$ for all $t$.

There is a small technicality about considering optimal collision-free paths.
For example, the visibility-graph method produces optimal solutions, but these
paths must graze the obstacles. Any path that maps into $\mathcal{C}_{free}$ can
be replaced by a shorter path that still maps into $\mathcal{C}_{free}$, but
might come closer to obstacles. The problem exists because $\mathcal{C}_{free}$
is an open set, and can be fixed by allowing the path to map into
$\mathcal{C}_{valid}$ (which is the closure of $\mathcal{C}_{free}$). If one
still must use $\mathcal{C}_{free}$, then the optimal path that maps into
$\mathcal{C}_{valid}$ will represent an infimum (a lower bound that can't quite
be reached) over paths that only map into $\mathcal{C}_{free}$.

A discrete-time representation The motion planning problem can alterna- 
tively be characterized in discrete time. For the systems that we will consider, 
discrete-time representations can provide arbitrarily close approximations to the 
continuous case, and facilitate the development of the dynamic programming al- 
gorithm. 

With the discretization of time, $[0, T_f]$ is partitioned into stages, denoted
by $k \in \{1, \ldots, K + 1\}$. Stage $k$ refers to time $(k - 1)\Delta t$. The
final stage is given by $K = \lfloor T_f / \Delta t \rfloor$. Let $q_k$
represent the configuration at stage $k$. At each stage $k$, an action $u_k$ can
be chosen from an action space $U$. Because

\[ \frac{dq}{dt} = \lim_{\Delta t \to 0} \frac{q(t + \Delta t) - q(t)}{\Delta t}, \]   (10.8)

the equation $\dot{q}(t) = u(t)$ can be approximated as

\[ q_{k+1} = q_k + \Delta t \, u_k, \]   (10.9)

in which $q_k = q(t)$, $q_{k+1} = q(t + \Delta t)$, and $u_k = u(t)$. As an
example of how this representation approximates the basic motion planning
problem, suppose $\mathcal{C}_{free} \subset \mathbb{R}^2$. It is assumed that
$\|u_k\| = 1$ and, hence, the space of possible actions can be sufficiently
characterized by the parameter $\phi_k \in [0, 2\pi)$. The discrete-time
transition equation becomes

\[ q_{k+1} = q_k + \Delta t \begin{pmatrix} \cos(\phi_k) \\ \sin(\phi_k) \end{pmatrix}. \]   (10.10)

At each stage, the direction of motion is controlled by selecting $\phi_k$. Any
$K$-segment polygonal curve of length $K \Delta t$ can be obtained as a possible
trajectory of the system. If an action is included that causes no motion,
shorter polygonal curves can also be obtained.

In general, a variety of holonomic and nonholonomic problems can also be
approximated in discrete time. The equation $\dot{q} = f(q(t), u(t))$ can be
approximated by a transition equation of the form $q_{k+1} = f_k(q_k, u_k)$.

A discrete-time representation of the loss functional can also be defined:

\[ L(q_1, \ldots, q_F, u_1, \ldots, u_K) = \sum_{k=1}^{K} l_k(q_k, u_k) + l_{K+1}(q_F), \]   (10.11)

in which $l_k$ and $l_{K+1}$ serve the same purpose as $l$ and $Q$ in the
continuous-time loss functional.

The basic motion planning problem can be represented in discrete time by
letting $l_k = 0$ for all $k \in \{1, \ldots, K\}$, and defining the final term
as $l_{K+1}(q_F) = 0$ if $q_F = q_{goal}$, and $l_{K+1}(q_F) = 1$ otherwise.
This gives equal preference to all trajectories that reach the goal. To
approximate the problem of planning an optimal-length path, $l_k = 1$ for all
$k \in \{1, \ldots, K\}$. The final term is then defined as $l_{K+1}(q_F) = 0$
if $q_F = q_{goal}$, and $l_{K+1}(q_F) = \infty$ otherwise.
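The discrete-time model and the two loss functionals just described can be
tried out on a planar example. This sketch assumes unit-speed actions given by
a heading $\phi_k$, with `None` standing for the optional no-motion action; the
function names, the goal tolerance, and the sample headings are illustrative
assumptions.

```python
import math


def rollout(q1, headings, dt):
    """Apply q_{k+1} = q_k + dt * (cos phi_k, sin phi_k) for each heading.

    A heading of None is treated as the action that causes no motion.
    """
    path = [q1]
    for phi in headings:
        x, y = path[-1]
        if phi is None:
            path.append((x, y))
        else:
            path.append((x + dt * math.cos(phi), y + dt * math.sin(phi)))
    return path


def discrete_loss(path, headings, q_goal, stage_cost, miss_cost, tol=1e-9):
    """Sum of stage costs l_k(q_k, u_k) plus the final term l_{K+1}(q_F)."""
    reached = math.dist(path[-1], q_goal) <= tol
    stages = sum(stage_cost(q, u) for q, u in zip(path, headings))
    return stages + (0.0 if reached else miss_cost)


headings = [0.0, 0.0, math.pi / 2]            # right, right, up
path = rollout((0.0, 0.0), headings, dt=1.0)  # ends at approximately (2, 1)

# Basic motion planning: zero stage cost, unit loss for missing the goal.
print(discrete_loss(path, headings, (2.0, 1.0), lambda q, u: 0.0, 1.0))       # 0.0
# Path-length version: unit stage cost, infinite loss for missing the goal.
print(discrete_loss(path, headings, (2.0, 1.0), lambda q, u: 1.0, math.inf))  # 3.0
```

The goal test uses a small tolerance because the trigonometric functions
introduce rounding error, so exact equality with the goal would rarely hold.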

10.4.2 The Algorithm 

This section presents algorithm issues that result from computing approximate 
optimal motion strategies. Variations of this algorithm, which apply to a variety 
of motion planning problems are discussed in detail in [448]. The quality of 
this approximation depends on the resolution of the representation chosen for 
the configuration space and action space. The efforts are restricted to obtaining 
approximate solutions for three primary reasons: 1) known lower-bound hardness 
results for basic motion planning and a variety of extensions; 2) exact methods 
often depend strongly on specialized analysis for a specific problem class; and 3) 
the set of related optimal-control and dynamic-game problems for which analytical 
solutions are available is quite restrictive. The computational hardness results 
have curbed many efforts to find efficient, complete algorithms to general motion 
planning problems. In [651] the basic motion planning problem was shown to
be PSPACE-hard for polyhedral robots with n links. In [122] it was shown that
computing minimum-distance paths in a 3-D workspace is NP-hard. It was also
shown that the compliant motion control problem with sensing uncertainty is 
nondeterministic exponential time hard. In [652] it was shown that planning the 
motion of a disk in a 3-D environment with rotating obstacles is PSPACE-hard. In 
[654], a 3-D pursuit-evasion problem is shown to be exponential time hard, even 
though there is perfect sensing information. Such results have turned motion 
planning efforts toward approximate techniques. For example, a polynomial-time 
algorithm is given in [606] for computing epsilon approximations of minimum- 
distance paths in a 3-D environment. Also, randomized techniques are used to 
compute solutions for high degree-of-freedom problems that are unapproachable 
by complete methods [16, 51, 383, 734]. 

The second motivation for considering approximate solutions is to avoid
specialized analysis of particular cases, with the intent of allowing the
algorithms to be adaptable to other problem classes. Of course, in many cases
there is great value in obtaining exact solutions to a specialized class of
problems. The approach described in this section can be considered as a general
way to approximate solutions that might be sufficient for a particular
application, or the approach might at least provide some understanding of the
solutions.

The final motivation for considering approximate solutions is that the class of
related optimal-control and dynamic-game problems that can be solved directly
is fairly restrictive. In both control theory and dynamic game theory, the classic
set of problems that can be solved are those with a linear transition equation and
quadratic loss functional [19, 37, 110].

The algorithm description is organized into three parts. First, the general 
principle of optimality is described, which greatly reduces the amount of effort 
that is required to compute optimal strategies. The next part describes how cost- 
to-go functions are computed as an intermediate representation of the optimal 
strategy. The third part describes how the cost-to-go is used as a navigation 
function to execute the represented strategy (i.e., selecting optimal actions during 
on-line execution). Following this, basic complexity assessments are given. 



Exploiting the principle of optimality Because the decision making expressed
in $q_{k+1} = f_k(q_k, u_k)$ is iterative, the dynamic programming principle can
generally be employed to avoid brute-force enumeration of alternative strategies, 
and it forms the basis of our algorithm. Although there are obvious connections 
to dynamic programming in graph search, it is important to note the distinctions 
between Dijkstra's algorithm and the usage of the dynamic programming princi- 
ple in these notes. In optimal control theory, the dynamic programming principle 
is represented as a differential equation (or difference equation in discrete time) 
that can be used to directly solve a problem such as the linear-quadratic Gaussian 
regulator [420], or can be used for computing numerical approximations of opti- 
mal strategies [431]. In the general case, the differential equation is expressed in 
terms of time-dependent cost-to-go functions. The cost-to-go is a function on the 
configuration space that expresses the cost that is received under the implementa- 
tion of an optimal strategy from that particular configuration and time. In some 
cases, the time index can be eliminated, as in the special case of values stored at 
vertices in the execution of Dijkstra's algorithm. 

For the discrete-time model, the dynamic programming principle is expressed 
as a difference equation (in continuous time it becomes a differential equation). 
The cost-to-go function at stage k is defined as 



$$L^*_k(q_k) = \min_{u_k,\ldots,u_K} \left\{ \sum_{i=k}^{K} l_i(q_i,u_i) + l_{K+1}(q_{K+1}) \right\}. \qquad (10.12)$$

The cost-to-go can be separated:

$$L^*_k(q_k) = \min_{u_k} \min_{u_{k+1},\ldots,u_K} \left\{ l_k(q_k,u_k) + \sum_{i=k+1}^{K} l_i(q_i,u_i) + l_{K+1}(q_{K+1}) \right\}. \qquad (10.13)$$

The second min does not affect the $l_k$ term; thus, it can be removed to obtain

$$L^*_k(q_k) = \min_{u_k} \left\{ l_k(q_k,u_k) + \min_{u_{k+1},\ldots,u_K} \left\{ \sum_{i=k+1}^{K} l_i(q_i,u_i) + l_{K+1}(q_{K+1}) \right\} \right\}. \qquad (10.14)$$

The second portion of the min represents the cost-to-go function for stage k + 1,
yielding [63]:

$$L^*_k(q_k) = \min_{u_k} \left\{ l_k(q_k,u_k) + L^*_{k+1}(q_{k+1}) \right\}. \qquad (10.15)$$

This final form represents a powerful constraint on the set of optimal strategies. 
The optimal strategy at stage k and configuration q depends only on the cost-to-go
values at stage k + 1. Furthermore, only the particular cost-to-go values that are
reachable from the transition equation, $q_{k+1} = f(q_k, u_k)$, need to be considered.
The dependencies are local; yet, the globally-optimal strategy is characterized. 
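
As a small check of (10.15), the following sketch compares brute-force enumeration of all action sequences, as in (10.12), with the backward recursion. The five-configuration line, the action set, the transition function `f`, and both loss functions are made-up assumptions chosen only for illustration; they are not taken from the text.

```python
from itertools import product

# Hypothetical toy problem: configurations 0..4 on a line, goal at 4.
STATES = range(5)
ACTIONS = (-1, 0, 1)
K = 4                       # number of stages in which an action is applied
INF = float("inf")


def f(q, u):
    """Transition equation q_{k+1} = f(q_k, u_k), clamped to the line."""
    return min(max(q + u, 0), 4)


def stage_loss(q, u):
    """Per-stage loss l_k(q_k, u_k): one unit per move, zero for waiting."""
    return abs(u)


def final_loss(q):
    """Final term l_{K+1}: zero in the goal, infinite elsewhere."""
    return 0 if q == 4 else INF


def brute_force(q0):
    """Minimize the loss over all len(ACTIONS)**K action sequences."""
    best = INF
    for seq in product(ACTIONS, repeat=K):
        q, total = q0, 0
        for u in seq:
            total += stage_loss(q, u)
            q = f(q, u)
        best = min(best, total + final_loss(q))
    return best


def backward_recursion():
    """Apply the recursion backward from stage K+1 down to stage 1."""
    L = {q: final_loss(q) for q in STATES}
    for _ in range(K):
        L = {q: min(stage_loss(q, u) + L[f(q, u)] for u in ACTIONS)
             for q in STATES}
    return L


L1 = backward_recursion()
for q in STATES:
    print(q, brute_force(q), L1[q])
```

The enumeration examines 3^4 sequences per start configuration, whereas the recursion performs only 5 x 3 evaluations per stage, which is the saving that the principle of optimality provides.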

Iteratively approximating cost-to-go functions An optimal strategy can 
be computed by successively building approximate representations of the cost-to- 
go functions. One straightforward way to represent a cost-to-go function is to 
specify its values at each location in a discretized representation of the config- 
uration space. Note that this requires visiting the entire configuration space to 
determine a strategy. Instead of a path, however, the resulting solution can be 
considered as a feedback strategy. From any configuration, the optimal action will 
be easily determined. Note that the cost-to-go function is encoding a globally- 
optimal solution which must take into account all of the appropriate geometric and 
topological information at a given resolution. Artificial potential functions have 
often been constructed very efficiently in motion planning approaches; however, 
these approaches heuristically estimate the cost-to-go and are typically prone to 
have local minima [51, 394]. 

The first step is to construct a representation of $L^*_{K+1}$. The final term,
$l_{K+1}(q_{K+1})$, of the loss functional is directly used to assign values of $L^*_{K+1}(q_{K+1})$
at discretized locations. Typically, $l_{K+1}(q_{K+1}) = 0$ if $q_{K+1}$ lies in the goal region, and
$l_{K+1}(q_{K+1}) = \infty$ otherwise. This only permits trajectories that terminate in the
goal region. If the goal is a point, it might be necessary to expand the goal into 
a region that includes some of the quantized configurations. 

The dynamic programming equation (10.15) is used to compute the next
cost-to-go function, $L^*_K$, and subsequent cost-to-go functions. For each quantized
configuration, $q_k$, a quantized set of actions $u_k \in U$ is evaluated. For a given action
$u_k$, the next configuration obtained by $q_{k+1} = f(q_k, u_k)$ generally might not lie
on a quantized configuration. See Figure 10.6.a. Linear interpolation between
neighboring quantized configurations can be used, however, to obtain the
appropriate loss value without restricting the motions to the grid (see Figure 10.6.a).
Suppose, for example, that for a one-dimensional configuration space, $L^*_{k+1}[i]$ and
$L^*_{k+1}[i+1]$ represent the loss values for some configurations $q_i$ and $q_{i+1}$. Suppose
that the transition equation, $f_k$, yields some $q$ that is between $q_i$ and $q_{i+1}$. Let

$$\alpha = \frac{q_{i+1} - q}{q_{i+1} - q_i}. \qquad (10.16)$$

Note that $\alpha = 1$ when $q = q_i$ and $\alpha = 0$ when $q = q_{i+1}$. The interpolated loss can
be expressed as

$$L^*_{k+1}(q_{k+1}) \approx \alpha L^*_{k+1}[i] + (1 - \alpha) L^*_{k+1}[i+1]. \qquad (10.17)$$

Figure 10.6: The computations are illustrated with a one-dimensional configuration
space. (a) The cost-to-go at the next stage is obtained by interpolation
of the values at the neighboring quantized configurations. (b) During execution,
interpolation can also be used to obtain a smooth trajectory.

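In an m-dimensional C-space, interpolation can be performed between $2^m$
neighbors. For example, if $\mathcal{C} = \mathbb{R}^2$, the interpolation can be computed as

$$L^*_{k+1}(q_{k+1}) \approx \alpha\beta L^*_{k+1}[i,j] + \alpha(1-\beta) L^*_{k+1}[i,j+1] + (1-\alpha)\beta L^*_{k+1}[i+1,j] + (1-\alpha)(1-\beta) L^*_{k+1}[i+1,j+1], \qquad (10.18)$$

in which $\alpha, \beta \in [0, 1]$ are coefficients that express the normalized distance to the
neighbors in the $q_1$ and $q_2$ directions, respectively. For example, $\alpha = 1$ and $\beta = 0$
when $q_{k+1}$ lies at the configuration represented by index $[i, j+1]$. Other schemes,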
such as quadratic interpolation, can be used to improve numerical accuracy at 
the expense of computation time [433]. Convergence properties of the quanti- 
zation and interpolation are discussed in [63, 69]. Interpolation represents an 
important step that overcomes the problems of measuring Manhattan distance 
due to quantization. Note that for some problems, however, interpolation might 
not be necessary. Suppose for example, that the robot is a manipulator that 
has independently-controlled joints. During each stage, each joint can be moved 
clockwise, counterclockwise, or not at all. These choices will naturally result in 
motions that fall directly onto a grid in the configuration space. 
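
The interpolation in (10.16) and (10.17), and its two-dimensional counterpart, can be sketched as follows. This is a minimal illustration that assumes a uniformly spaced grid and finite loss values; the function names `interp_1d` and `interp_2d` and the grid parameters `q0` and `dq` are hypothetical.

```python
import math


def interp_1d(L_next, q, q0=0.0, dq=1.0):
    """Linearly interpolate the grid values L_next at configuration q.

    Grid point i sits at q0 + i * dq. The weight alpha follows (10.16):
    it is 1 at q_i and 0 at q_{i+1}.
    """
    i = int(math.floor((q - q0) / dq))
    i = max(min(i, len(L_next) - 2), 0)          # keep i and i+1 on the grid
    q_i = q0 + i * dq
    alpha = (q_i + dq - q) / dq
    return alpha * L_next[i] + (1 - alpha) * L_next[i + 1]


def interp_2d(L_next, q1, q2):
    """Bilinear interpolation on a unit-spaced grid indexed as L_next[i][j]."""
    i = max(min(int(math.floor(q1)), len(L_next) - 2), 0)
    j = max(min(int(math.floor(q2)), len(L_next[0]) - 2), 0)
    alpha = (i + 1) - q1                         # weight of row i
    beta = (j + 1) - q2                          # weight of column j
    return (alpha * beta * L_next[i][j]
            + alpha * (1 - beta) * L_next[i][j + 1]
            + (1 - alpha) * beta * L_next[i + 1][j]
            + (1 - alpha) * (1 - beta) * L_next[i + 1][j + 1])


print(interp_1d([0.0, 2.0, 4.0], 0.25))
print(interp_2d([[0, 1], [2, 3]], 0.5, 0.5))
```

In m dimensions the same pattern visits $2^m$ neighbors, so the cost of each lookup grows exponentially with the dimension.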
For a motion planning problem, the obstacle constraints must additionally
be taken into account. The constraints can be directly evaluated each time to
determine whether each $q_{k+1}$ lies in the free space, or a bitmap representation of
the configuration space can be used for quick evaluations (an efficient algorithm
for building a bitmap representation of $\mathcal{C}_{free}$ is given in [385]).

Note that $L^*_K$ represents the cost of the optimal one-stage strategy from each
configuration $q_K$. More generally, $L^*_{K-i}$ represents the cost of the optimal
$(i+1)$-stage strategy from each configuration $q_{K-i}$. For a motion planning problem,
one is typically concerned only with strategies that require a finite number of
stages before terminating in the goal region. For a small, positive $\delta$, the dynamic
programming iterations are terminated when $|L^*_k(q_k) - L^*_{k+1}(q_{k+1})| < \delta$ for all
values in the configuration space. This assumes that the robot is capable of
selecting actions that halt it in the goal region. The resulting stabilized cost-to-go
function can be considered as a representation of the optimal strategy. Note that
no choice of K is necessary because termination occurs when the loss values have
stabilized. Also, only the representation of $L^*_{k+1}$ is retained while constructing $L^*_k$;
earlier representations can be discarded to save storage space.
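
The iteration described above can be sketched on a one-dimensional grid. Everything specific here is an assumption made for illustration: the eleven grid points, the step size of 1.5 (chosen so that motions land between grid points), the goal interval, and the unit loss per stage with free halting inside the goal.

```python
import math

INF = float("inf")
GRID = [float(i) for i in range(11)]    # quantized configurations 0, 1, ..., 10
ACTIONS = (-1.5, 0.0, 1.5)              # steps that land between grid points
GOAL_LO, GOAL_HI = 4.0, 6.0             # goal region on the line
DELTA = 1e-9                            # stabilization threshold


def in_goal(q):
    return GOAL_LO <= q <= GOAL_HI


def loss(q, u):
    """Unit loss per stage; halting inside the goal is free."""
    return 0.0 if (in_goal(q) and u == 0.0) else 1.0


def interp(L, q):
    """Linear interpolation of grid values. Zero-weight terms are skipped
    so that an infinite neighbor with weight 0 does not produce NaN."""
    i = min(int(math.floor(q)), len(L) - 2)
    alpha = (i + 1) - q
    value = 0.0
    if alpha > 0.0:
        value += alpha * L[i]
    if alpha < 1.0:
        value += (1.0 - alpha) * L[i + 1]
    return value


def backup(L):
    """One application of (10.15) at every quantized configuration."""
    new = []
    for q in GRID:
        best = INF
        for u in ACTIONS:
            q_next = q + u
            if GRID[0] <= q_next <= GRID[-1]:       # stay inside the C-space
                best = min(best, loss(q, u) + interp(L, q_next))
        new.append(best)
    return new


def stabilized(a, b):
    """True when no value changed by DELTA or more (inf == inf counts)."""
    return all(x == y or abs(x - y) < DELTA for x, y in zip(a, b))


L = [0.0 if in_goal(q) else INF for q in GRID]      # terminal cost-to-go
iterations = 0
while True:
    L_new = backup(L)
    iterations += 1
    if stabilized(L, L_new):
        break
    L = L_new                                       # older stages are discarded
print(iterations, L_new)
```

Only two arrays are alive at any time, which mirrors the remark that earlier representations can be discarded. Because of interpolation, the stabilized values are approximations of the true number of stages to the goal rather than exact integers.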

The general advantages of these kinds of computations were noted long ago in 
[431]: 1) extremely general types of system equations, performance criteria, and 
constraints can be handled; 2) particular questions of existence and uniqueness 
are avoided; 3) a true feedback solution is directly generated. 

Using the cost-to-go as a navigation function To execute the optimal strat- 
egy, an appropriate action must be chosen using the cost-to-go representation from 
any given configuration (see Figure 10.6.b). One approach would be to simply
store the action that produced the optimal cost-to-go value, for each quantized 
configuration. The appropriate action could then be selected by recalling the 
stored action at the nearest quantized configuration. This method could cause er- 
rors, particularly since it does not utilize any benefits of interpolation. A preferred 
alternative is to select actions by locally evaluating (10.15) at the exact current 
configuration. Linear interpolation can be used as before. Note that although 
the approach to select the action is local (and efficient), the global information is 
still taken into account (it is encoded in the cost-to-go function). This concept is 
similar to the use of a numerical navigation function in previous motion planning 
literature [51, 659] (such as NF1 or NF2), and the cost-to-go is a form of progress 
measure, as considered in [230]. When considering the cost-to-go as a navigation 
function, it is important to note that it does not contain local minima because 
it is constructed as a by-product of determining the optimal solution. Once the 
optimal action is determined, an exact next configuration is obtained (i.e., not a 
quantized configuration). This form of iteration continues until the goal is reached 
or a termination condition is met. During the time between stages, the trajectory 
can be linearly interpolated between the endpoints given by the discrete-time tran- 
sition equation, or can be integrated using an original continuous-time transition 
equation.
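
The execution loop can be sketched as follows: at each stage, (10.15) is evaluated locally at the exact current configuration, using interpolation into a stored cost-to-go table. The table `L`, the action set, the goal interval, and the loss are all assumed values for a hypothetical one-dimensional problem, not results from the text.

```python
import math

# Assumed precomputed cost-to-go values on the grid q = 0, 1, ..., 10.
L = [2.875, 2.25, 1.5, 1.0, 0.0, 0.0, 0.0, 1.0, 1.5, 2.25, 2.875]
ACTIONS = (-1.5, 0.0, 1.5)
GOAL_LO, GOAL_HI = 4.0, 6.0


def in_goal(q):
    return GOAL_LO <= q <= GOAL_HI


def loss(q, u):
    """Unit loss per stage; halting inside the goal is free."""
    return 0.0 if (in_goal(q) and u == 0.0) else 1.0


def interp(q):
    """Linear interpolation of the stored cost-to-go at an exact q."""
    i = min(int(math.floor(q)), len(L) - 2)
    alpha = (i + 1) - q
    return alpha * L[i] + (1.0 - alpha) * L[i + 1]


def select_action(q):
    """Pick the action minimizing stage loss plus interpolated cost-to-go."""
    candidates = [u for u in ACTIONS if 0.0 <= q + u <= 10.0]
    return min(candidates, key=lambda u: loss(q, u) + interp(q + u))


def execute(q, max_stages=50):
    """Follow the strategy from an exact (non-quantized) configuration."""
    path = [q]
    for _ in range(max_stages):
        u = select_action(q)
        if in_goal(q) and u == 0.0:      # the robot halts in the goal
            break
        q = q + u
        path.append(q)
    return path


print(execute(0.3))
```

Each stage costs one interpolated lookup per action, and the configurations visited are exact rather than snapped to the grid.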

Computational expense Consider the computation time for the dynamic pro- 
gramming algorithm for the basic case modeled by (10.15). Let c denote the 
number of quantized values per axis of the configuration space. Let m denote 
the dimension of the configuration space. Let a denote the number of quantized
actions. Each stage of the cost-to-go computations takes time $O(c^m a)$, and
the number of stages before stabilization is nearly equal to the longest optimal
trajectory (in terms of the number of stages) that reaches the goal. The space
complexity is obviously $O(c^m)$. The algorithm is efficient for fixed dimension,
yet suffers from the exponential dependence on dimension that appears in most
deterministic motion planning algorithms. The utilization of the cost-to-go
function during execution requires $O(a)$ time in each stage. These time complexities
assume constant evaluation time of the cost-to-go at the next stage; however, if
multilinear interpolation is used, then additional exponential-time computation is
added because $2^m$ neighbors are evaluated.
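
To give a feel for these bounds, the short sketch below counts elementary evaluations per stage for a few dimensions. The function name and the sample values of c, m, and a are illustrative assumptions.

```python
def evaluations_per_stage(c, m, a, interpolate=False):
    """Count evaluations in one stage: c**m configurations times a actions,
    times 2**m neighbor lookups when multilinear interpolation is used."""
    count = c ** m * a
    return count * 2 ** m if interpolate else count


for m in (2, 3, 6):
    print(m, evaluations_per_stage(100, m, 10),
          evaluations_per_stage(100, m, 10, interpolate=True))
```

With 100 values per axis and 10 actions, moving from two to six dimensions raises the per-stage count from 10^5 to 10^13 before interpolation is even considered.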

10.5 Reinforcement Learning 

We can now extend the infinite-horizon MDP problem by assuming that $P(x'|x,u)$
in Equations (10.3) and (10.4) is unknown. This is essentially saying that we have
no idea what the distributions of nature are. Traditionally, this hurdle is handled 
by the following steps. 

1. Learning phase (Travel through the states in X, try various actions, and 
gather statistics.) 

2. Planning phase (Use value iteration or policy iteration to compute $J^*$ and
$\gamma^*$.)

3. Execution phase. 

In the learning phase, if the number of trials is sufficiently large, $P(x'|x,u)$ can
be estimated relatively well. Also during the learning phase, we can observe the 
losses associated with states and actions. If we combine the three steps above and 
run the world through a Monte Carlo simulator, we get reinforcement learning. 
Figure 10.7 shows an outline of the architecture. 

A major issue of reinforcement learning is the problem of exploration vs.
exploitation. The goal of exploration is to try to gather more information about
$P(x'|x,u)$, but it might end up choosing actions that yield high losses. The goal
of exploitation is to make good decisions based on knowledge of $P(x'|x,u)$, but
it might fail to learn a better solution. Pure exploitation is vulnerable to getting
stuck in a bad solution, while pure exploration requires lots of resources and
gathers information that might never be used.
Figure 10.7: Reinforcement Learning Architecture



10.5.1 Stochastic Iterative Algorithms 

Recall that the original evaluation of a particular strategy was:

$$J_\gamma(x) = l(x, u) + \alpha \sum_{x'} P(x'|x,u) J_\gamma(x')$$

But the problem now is that $P(x'|x,u)$ is unknown. Instead, we use what is called
a stochastic iterative algorithm. $J_\gamma(x)$ will be updated with the following equation,
in which $\rho$ is the learning rate.

$$J_\gamma(x) = (1 - \rho) J_\gamma(x) + \rho \left( l(x, \gamma(x)) + \alpha J_\gamma(x') \right)$$

In this equation, $x'$ is now observed instead of calculated from $f(x,u,\theta)$. A
question a keen reader might ask is: where have the probabilities gone? They are
conspicuously missing in the above equation. The answer is that they are really
embedded in the observations of $x'$ from nature. In the Monte Carlo simulation,
states that have high probability will occur more often and thus will have a bigger
influence on $J_\gamma$. In this manner, over time the probability distribution of $x'$ will
be stored in $J_\gamma$.
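
The update can be sketched on a tiny simulated world. The two-state model, the losses, the discount factor, and the constant learning rate below are assumptions made for illustration; the learner only observes sampled next states and never reads the transition probabilities.

```python
import random

ALPHA = 0.5      # discount factor
RHO = 0.01       # learning rate
STATES = (0, 1)


def loss(x):
    """Loss l(x, gamma(x)) under the fixed strategy being evaluated."""
    return 1.0 if x == 0 else 0.0


def nature_step(x, rng):
    """Simulator standing in for nature: the next state is 0 or 1 with
    equal probability, regardless of x."""
    return rng.choice(STATES)


def evaluate(num_steps=20000, seed=0):
    """Estimate J_gamma from observed transitions only."""
    rng = random.Random(seed)
    J = {x: 0.0 for x in STATES}
    x = 0
    for _ in range(num_steps):
        x_next = nature_step(x, rng)                 # observed, not computed
        J[x] = (1 - RHO) * J[x] + RHO * (loss(x) + ALPHA * J[x_next])
        x = x_next
    return J


J = evaluate()
print(J)   # solving the two linear equations by hand gives J(0) = 1.5, J(1) = 0.5
```

With a constant learning rate the estimates keep fluctuating around the exact values; a smaller rate reduces the fluctuation at the price of slower adaptation.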



10.5.2 Finding an Optimal Strategy: Q-learning 

So how do we find the optimal strategy? The answer lies in Q: rather than
using just $J^* : X \rightarrow \mathbb{R}$, the expected loss of a particular strategy, now we use
$Q^* : X \times U \rightarrow \mathbb{R}$. $Q^*(x, u)$ represents the optimal cost-to-go from applying $u$ and
then continuing on the optimal path after that. Note that Q is independent of
the policy being followed.

Using $Q^*(x,u)$ in the dynamic programming equation yields:

$$Q^*(x, u) = l(x, u) + \alpha \sum_{x'} P(x'|x,u) \min_{u' \in U(x')} Q^*(x',u')$$
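
When $P(x'|x,u)$ is known, this recursion can be iterated directly. The sketch below does so on a made-up two-state, two-action model and, for comparison, also runs the sample-based form of the update that this section arrives at, in which the model is hidden inside a simulator. The model, losses, discount factor, learning rate, and uniformly random exploration are all assumptions.

```python
import random

ALPHA = 0.5                       # discount factor
STATES, ACTIONS = (0, 1), (0, 1)
LOSS = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 0.0, (1, 1): 0.5}
# P[(x, u)][x_next] = P(x_next | x, u); a made-up model for illustration.
P = {(0, 0): (0.8, 0.2), (0, 1): (0.2, 0.8),
     (1, 0): (0.1, 0.9), (1, 1): (0.0, 1.0)}


def q_iteration(tol=1e-12):
    """Iterate the dynamic programming equation for Q* with known P."""
    Q = {xu: 0.0 for xu in LOSS}
    while True:
        new = {}
        for (x, u) in Q:
            expected = sum(P[(x, u)][xn] * min(Q[(xn, un)] for un in ACTIONS)
                           for xn in STATES)
            new[(x, u)] = LOSS[(x, u)] + ALPHA * expected
        if max(abs(new[k] - Q[k]) for k in Q) < tol:
            return new
        Q = new


def q_learning(num_steps=50000, rho=0.02, seed=0):
    """Sample-based counterpart: P is used only inside the simulator."""
    rng = random.Random(seed)
    Q = {xu: 0.0 for xu in LOSS}
    x = 0
    for _ in range(num_steps):
        u = rng.choice(ACTIONS)                          # pure exploration
        x_next = 0 if rng.random() < P[(x, u)][0] else 1
        target = LOSS[(x, u)] + ALPHA * min(Q[(x_next, un)] for un in ACTIONS)
        Q[(x, u)] = (1 - rho) * Q[(x, u)] + rho * target
        x = x_next
    return Q


Q_exact = q_iteration()
Q_learned = q_learning()
J_exact = {x: min(Q_exact[(x, u)] for u in ACTIONS) for x in STATES}
print(Q_exact)
print(Q_learned)
print(J_exact)
```

Because the minimum over $u'$ appears inside the update, the learned table approximates $Q^*$ even though actions are chosen at random here, which illustrates the remark that Q is independent of the policy being followed.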

If we let $J^*(x)$ be the expected cost for the optimal strategy given state $x$, and
$Q^*(x, u)$ be the expected cost for the optimal strategy given state $x$ and using
action $u$, then

$$J^*(x) = \min_{u \in U(x)} Q^*(x,u).$$

Figure 10.8: A tree for the extensive form.



However, for reinforcement learning, the probability $P(x'|x,u)$ is unknown, so
we can bring in the stochastic iterative idea again and get 



$$Q^*(x, u) := (1 - \rho) Q^*(x, u) + \rho \left( l(x, u) + \alpha \min_{u' \in U(x')} Q^*(x',u') \right)$$

10.6 Sequential Game Theory

Until now we have used matrices to describe the games. This representation is
called normal form. For sequential games (i.e., parlor games), in which a player
makes a decision based on the outcome of previous decisions of all the players, we
can use the extensive form to describe the game.

The rules of a sequential game specify a series of well-defined moves, where each
move is a point of decision for a given player from among a set of alternatives. The
particular alternative chosen by a player at a given decision point is called a choice,
and the totality of choices available to him at the decision point is defined as the
move. A sequence of choices, one following another until the game is terminated,
is called a play. The extensive form description of a sequential game consists of
the following:

• A finite tree that describes the relation of each move to all other moves. The
root of the tree is the first move of the game.

• A partition of the nodes of the tree that indicates which of the players takes
each move.

• A refinement of the previous partition into information sets. Nodes that
belong to the same information set are indistinguishable to the player.



• A set of outcomes to each of the plays in the game. 
Figure 10.8 shows an example of a tree for a sequential game. The numbers
beside the nodes indicate which player takes the corresponding move. The edges
are labeled by the corresponding choice selected. The leaves indicate the outcome
of the play selected (a root-leaf path in the tree). The information sets are
shown with dashed ellipses around the nodes. Nodes inside the same ellipse are
indistinguishable for the players, but the players can differentiate nodes from one
information set to another. If every ellipse encloses only one node, then we say
that the players have perfect information of the game, which leads to a "feedback
strategy".

In the extensive form all games are described with a tree. For games like chess 
this may not seem reasonable, since the same arrangement of pieces on the board 
can be generated by several different routes. However, for the extensive form, two 
moves are different if they have different past histories, even if they have exactly 
the same possible future moves and outcomes. 
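
One possible in-memory representation of the extensive form is sketched below. The `Node` class, its field names, and the small two-move game are hypothetical; the game is not the tree of Figure 10.8, whose details are not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A decision point (move) or, when `children` is empty, an outcome."""
    player: int = 0                       # who moves here; unused at leaves
    info_set: str = ""                    # label of the information set
    outcome: float = 0.0                  # value attached to a leaf
    children: dict = field(default_factory=dict)   # choice label -> Node


def plays(node, prefix=()):
    """Enumerate every play as (sequence of choices, outcome)."""
    if not node.children:
        yield prefix, node.outcome
        return
    for choice, child in node.children.items():
        yield from plays(child, prefix + (choice,))


def has_perfect_information(root):
    """True if every information set contains exactly one decision node."""
    counts = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node.children:
            counts[node.info_set] = counts.get(node.info_set, 0) + 1
            stack.extend(node.children.values())
    return all(c == 1 for c in counts.values())


def leaf(value):
    return Node(outcome=value)


# Hypothetical game: player 1 picks L or R, then player 2 picks a or b
# without observing player 1's choice (both nodes share the set "P2").
root = Node(player=1, info_set="P1", children={
    "L": Node(player=2, info_set="P2", children={"a": leaf(3), "b": leaf(0)}),
    "R": Node(player=2, info_set="P2", children={"a": leaf(1), "b": leaf(2)}),
})
print(list(plays(root)))
print(has_perfect_information(root))
```

Since each node is reached by exactly one path from the root, two positions with different histories are automatically distinct nodes, in line with the remark about chess above.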

10.6.1 Dynamic Programming over Sequential Games 



10.6.2 Algorithms for Special Games 
Chapter 11

The Information Space 



Chapter Status 




What does this mean? Check 

http://msl.cs.uiuc.edu/planning/status.html

for information on the latest version. 



Up to now it has been assumed everywhere that the current state is known. 
Suppose instead that the state is not exactly known. In this case, information 
regarding the state is obtained from sensors during the execution of a plan. This 
situation arises in most applications that involve interaction with the physical 
world. For example in robotics, it is virtually impossible for a robot to precisely 
know its state, except in some limited cases. What should be done if there is 
limited information regarding the state? A classical approach is to take all of 
the information available and try to estimate the state. If the estimates are 
sufficiently reliable, then we may safely pretend that there is no uncertainty in 
state information. This enables many of the planning methods introduced so far 
to be applied with only minor adaptation. 

The more interesting case occurs when state estimation is altogether avoided. 
It may be surprising, but many important tasks can be defined and solved with- 
out ever requiring that specific states are reached, even though a state space is 
defined for the planning problem. To achieve this, the planning problem will be 
expressed in terms of an information space. The information space serves the 
same purpose for sensing problems as the configuration space of Chapter 4 did 
for problems that involve geometric transformations. The information space rep- 
resents the place where problems that involve sensing uncertainty naturally live. 
Successfully formulating and solving such problems will depend on our ability to 
manipulate, simplify, and control the information space. In some cases, elegant 
solutions exist, and in others there appears to be no hope at present of efficiently 
solving them. There are many exciting open research problems associated with 
information spaces and sensing uncertainty in general.



[Diagram labels: Sensing, Machine, Actuation, Environment]



Figure 11.1: The state of the environment is not known. The only information 
available to make inferences regarding the state is the history of sensor obser- 
vations, actions that have been applied, and the initial conditions. This history 
becomes the information state. 

Recall the situation depicted in Figure 11.1, which was also shown in Section 
1.4. It is assumed that the state of the environment is not known. There are three 
general sources of information regarding the state:

1. The initial conditions can provide powerful information regarding the state 
before any actions are applied. It might even be the case that the initial 
state is given. At the other extreme, the initial conditions might contain no 
information. 

2. The sensor observations provide measurements of the state during execution. 
These measurements are usually incomplete or involve disturbances that 
distort their values. 

3. The actions already executed in the plan provide valuable information re- 
garding the state. For example, if a robot is commanded to move east (with 
no other uncertainties except an unknown state), then it is expected that 
the state is further east than it was previously. Thus, previously applied 
actions provide important clues for deducing the state. 

Section 11.1 will formalize these concepts for the case of discrete state spaces. 
OTHER SECTIONS. 

There are generally two ways to use the information space: 

1. Take all of the information available, and try to estimate the state. 

This is the classical approach. Pretend that there is no longer any uncer- 
tainty in state, but hope (or prove) that the resulting motion strategy or 
control law works under reasonable estimation error. 
A plan is generally expressed as $\pi : X \rightarrow U$.

2. Solve the task entirely in terms of an information space.

Many robot tasks may be achieved without ever knowing the exact state.
The goals and analysis are formulated in the information space, without the
need to achieve particular states.

A plan is generally expressed as $\pi : \mathcal{I} \rightarrow U$, for an information space, $\mathcal{I}$.
The first approach may be considered somewhat classical. Most of the focus of
the chapter is on the second approach, which represents a powerful way to express 
and solve problems. 

GUIDE TO THE SECTIONS: Section 11.1 will formalize these concepts for 
the case of discrete state spaces. 

11.1 Discrete State Spaces 
11.1.1 Sensors 

As the name suggests, sensors are designed to sense the state. Throughout all of 
Section 11.1 it is assumed that the state space, $X$, is finite or countably infinite, as
in Formulations 2.2.1 and 2.4.2. A sensor is defined in terms of two components:
1) the observation space, which is the set of possible readings for the sensor, and 2) the
sensor mapping, which characterizes the readings that can be expected if the state
is given. In the planning model, the state will not be given; it is only assumed to
be given when modeling a sensor.

Let Y denote an observation space, which is a finite or countably infinite set. 
Let h denote the sensor mapping. Three different kinds of sensor mappings will 
be considered, each of which is more complicated and general than the previous
one: 

1. State sensor mapping: In this case, $h : X \rightarrow Y$, which means that given
the state, the observation is completely determined.

2. State-nature sensor mapping: In this case, a finite set, $\Psi(x)$, of nature
sensing actions is defined for each $x \in X$. Each nature sensing action,
$\psi \in \Psi(x)$, interferes with the sensor observation. Therefore, the state-nature
mapping, $h$, produces an observation, $y = h(x, \psi) \in Y$, for every $x \in X$ and
$\psi \in \Psi(x)$. The particular $\psi$ chosen by nature is assumed to be unknown
during planning and execution. However, it is specified as part of the sensing
model.

3. History-based sensor mapping: In this case, the observation could be
based on the current state or any previous states. Furthermore, a nature
sensing action could be applied. Suppose that the current stage is $k$. The
set of nature sensing actions is denoted by $\Psi_k(x)$, and the particular nature
sensing action is $\psi_k \in \Psi_k(x)$. This yields a very general sensor mapping,
defined as

$$y_k = h(x_1, \ldots, x_k, \psi_k), \qquad (11.1)$$

in which $y_k$ is the observation obtained in stage $k$.

Many examples of sensors will now be given. These are provided to illustrate 
the definitions and to provide building blocks that will be used in later examples of 
information spaces. Examples 11.1.1 to 11.1.5 all involve state sensor mappings. 
Example 11.1.1 (Odd/even sensor) Let $X = \mathbb{Z}$, the set of integers. Let $Y =
\{0, 1\}$. A sensor mapping can be defined as

$$y = h(x) = \begin{cases} 0 & \text{if } x \text{ is even,} \\ 1 & \text{if } x \text{ is odd.} \end{cases} \qquad (11.2)$$

The limitation of this sensor is that it only tells whether $x \in X$ is odd or even.
When combined with other information, this might be enough to infer the state,
but in general it provides incomplete information. ■



Example 11.1.2 (Mod sensor) Example 11.1.1 can be easily generalized to yield
the remainder when $x$ is divided by $k$, for some fixed integer $k$. Let $X = \mathbb{Z}$, and
let $Y = \{0, 1, \ldots, k - 1\}$. The sensor mapping is defined as

$$y = h(x) = x \bmod k. \qquad (11.3)$$



Example 11.1.3 (Sign sensor) Let $X = \mathbb{Z}$, and let $Y = \{-1, 0, 1\}$. The sensor
mapping is defined as

$$y = h(x) = \operatorname{sgn} x. \qquad (11.4)$$

This sensor provides very limited information because it only indicates on which
side of the boundary $x = 0$ the state may lie. The one exception is that it can
precisely determine whether $x = 0$ or $x \neq 0$. ■



Example 11.1.4 (Selective sensor) Let $X = \mathbb{Z} \times \mathbb{Z}$, and let $(i,j) \in X$ denote
a state, in which $i, j \in \mathbb{Z}$. Suppose that only one component of $(i,j)$ can be
observed. This yields the sensor mapping

$$y = h(i,j) = i. \qquad (11.5)$$

An obvious generalization can be made for any state space that is formed from
Cartesian products. The sensor reveals the values of one or more components,
and the rest remain hidden. ■



Example 11.1.5 (Bijective sensor) Let $X$ be any state space, and let $Y = X$.
Let the sensor mapping be defined as any bijective function $h : X \rightarrow Y$. This
sensor provides information that is equivalent to having knowledge of the state.
Because $h$ is bijective, it can be inverted to obtain $h^{-1} : Y \rightarrow X$. For any
$y \in Y$, the state can be determined as $x = h^{-1}(y)$.

A special case of the bijective sensor is the identity sensor, for which $h$ is the
identity function. This was essentially assumed to exist for all planning problems
covered before this chapter because it immediately yields the state. However, any
bijective sensor would serve the same purpose. ■



Example 11.1.6 (Null sensor) Let $X$ be any state space, and let $Y = \{0\}$.
The null sensor is obtained by defining the sensor mapping as any function
$h : X \rightarrow Y$. The sensor reading remains fixed, and hence contains no information
regarding the state. ■

From the examples so far, it is tempting to think about partitioning $X$ based on
sensor observations. Suppose that in general a state sensor mapping, $h$, is not
bijective, and let $H(y)$ denote the following subset of $X$:

$$H(y) = \{x \in X \mid y = h(x)\}, \qquad (11.6)$$

called the preimage of $y$. The set of preimages, one for each $y \in Y$, forms a partition
of $X$. In some sense, this indicates the "resolution" of the sensor. A bijective
sensor partitions $X$ into singleton sets because it contains perfect information. At
the other extreme, the null sensor partitions $X$ into a single set, $X$ itself. The
sign sensor appears slightly more useful because it partitions $X$ into three sets:
$H(1) = \{1, 2, \ldots\}$, $H(-1) = \{\ldots, -2, -1\}$, and $H(0) = \{0\}$. The preimages
of the selective sensor are particularly interesting. For each $i \in \mathbb{Z}$, $H(i) = \{i\} \times \mathbb{Z}$.
EXPLAIN CONNECTION TO QUOTIENT GROUPS FOR MOD SENSOR. 
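
The partitions induced by the state sensor mappings above can be checked mechanically on a finite window of $X = \mathbb{Z}$. The window, the dictionary of sensors, and the helper names below are assumptions made for illustration; the modulus 3 is an arbitrary choice of $k$.

```python
def sgn(x):
    """Sign of an integer as -1, 0, or 1."""
    return (x > 0) - (x < 0)


SENSORS = {
    "odd/even": lambda x: x % 2,     # Python's % is nonnegative for k > 0
    "mod 3": lambda x: x % 3,
    "sign": sgn,
    "identity": lambda x: x,
    "null": lambda x: 0,
}


def preimages(h, window):
    """Group the states in `window` by observation: H(y) = {x | y = h(x)}."""
    H = {}
    for x in window:
        H.setdefault(h(x), set()).add(x)
    return H


def is_partition(H, window):
    """True if the blocks of H are disjoint and cover the window."""
    blocks = list(H.values())
    union = set().union(*blocks)
    disjoint = sum(len(b) for b in blocks) == len(union)
    return disjoint and union == set(window)


WINDOW = range(-3, 4)                # finite stand-in for X = Z
for name, h in SENSORS.items():
    H = preimages(h, WINDOW)
    print(name, {y: sorted(xs) for y, xs in H.items()}, is_partition(H, WINDOW))
```

For the mod sensor, each block is a residue class of the integers modulo $k$, restricted here to the window.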

Next consider some examples that involve a state-nature sensor mapping.
There are two different possibilities regarding the model for the nature sensing
action:

1. Nondeterministic: In this case, there is no additional information regarding
which $\psi \in \Psi(x)$ will be chosen.

2. Probabilistic: A probability distribution is known. In this case, the
probability, $P(\psi|x)$, that $\psi$ will be chosen is known for each $\psi \in \Psi(x)$.

These two possibilities also appeared in Section ?? for nature actions that interfere
with the state transition equation.

It is sometimes useful to consider the state-nature sensor model as a probability
distribution over $Y$ for a given state. Suppose that when the domain of $h$ is
restricted to some $x \in X$, it forms an injective mapping from $\Psi$ to $Y$. In
other words, every nature action leads to a unique observation, assuming $x$ is
fixed. Using $P(\psi)$ and $h$, one can easily derive $P(y|x)$ as

$$P(y|x) = \begin{cases} P(\psi) & \text{for the unique } \psi \text{ such that } y = h(x,\psi), \\ 0 & \text{if no such } \psi \text{ exists.} \end{cases} \qquad (11.7)$$

If the injective assumption is lifted, then $P(\psi)$ is replaced by a sum over all $\psi$ for
which $y = h(x, \psi)$.
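
The derivation of $P(y|x)$ can be sketched in a few lines that cover both the injective case and the general case, since summing over matching nature actions reduces to a single term when $h$ is injective in $\psi$. The distribution `P_PSI` and the two sensor mappings are illustrative assumptions.

```python
from collections import defaultdict
from fractions import Fraction


def sgn(v):
    """Sign of an integer as -1, 0, or 1."""
    return (v > 0) - (v < 0)


def observation_distribution(h, x, P_psi):
    """Return P(y | x) by summing P(psi) over all psi with y = h(x, psi)."""
    P_y = defaultdict(Fraction)
    for psi, p in P_psi.items():
        P_y[h(x, psi)] += p
    return dict(P_y)


# Assumed distribution over nature sensing actions, independent of x.
P_PSI = {-1: Fraction(1, 4), 0: Fraction(1, 2), 1: Fraction(1, 4)}


def additive(x, psi):
    """Injective in psi for fixed x: each psi gives a distinct y."""
    return x + psi


def disturbed_sign(x, psi):
    """Not injective in psi in general: several psi can give the same y."""
    return sgn(x + psi)


print(observation_distribution(additive, 5, P_PSI))
print(observation_distribution(disturbed_sign, 5, P_PSI))
print(observation_distribution(disturbed_sign, 0, P_PSI))
```

Exact fractions are used so that the probabilities visibly sum to one in every case.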
Example 11.1.7 (Sensor disturbance) Let $X = \mathbb{Z}$, $Y = \mathbb{Z}$, and $\Psi = \{-1, 0, 1\}$.
The idea is to construct a sensor that would be the identity sensor if it were not
for the interference of nature. The sensor mapping is

$$y = h(x, \psi) = x + \psi. \qquad (11.8)$$

It is always known that $|x - y| \leq 1$. Therefore, if $y$ is received as a sensor reading,
one of the following must be true: $x = y - 1$, $x = y$, or $x = y + 1$. ■



Example 11.1.8 (Disturbed sign sensor) Let $X = \mathbb{Z}$, $Y = \{-1, 0, 1\}$, and
let $\Psi = \{-1, 0, 1\}$. Let the sensor mapping be defined as

$$y = h(x, \psi) = \operatorname{sgn}(x + \psi). \qquad (11.9)$$

In this case, if $y = 0$, it is no longer known for certain whether $x = 0$. It is possible
that $x = -1$ or $x = 1$. If $x = 0$, then it is possible for the sensor to read $-1$, $0$, or
$1$. ■



Example 11.1.9 (Disturbed odd/even sensor) It is not hard to construct
examples for which some mild interference from nature destroys all of the
information. Let $X = \mathbb{Z}$, $Y = \{0, 1\}$, and $\Psi = \{0, 1\}$. Let the sensor mapping be
defined as

$$y = h(x, \psi) = \begin{cases} 0 & \text{if } x + \psi \text{ is even,} \\ 1 & \text{if } x + \psi \text{ is odd.} \end{cases} \qquad (11.10)$$

If the value of $\psi$ is not known, then the sensor provides no useful information
regarding the state. For example, it may yield $y = 0$, but it is not known whether
$x$ is even or odd. If there is a probabilistic model for the nature sensing action,
then this sensor may provide some useful information. ■

It is once again informative to consider preimages. For a state-nature sensor
mapping, the preimage is defined as

$$H(y) = \{x \in X \mid \exists \psi \in \Psi(x) \text{ for which } y = h(x, \psi)\}. \qquad (11.11)$$

In comparison to state sensor mappings, the preimage sets are larger for
state-nature sensor mappings. They also do not generally form a partition of $X$. For
example, the preimages of Example 11.1.8 are: $H(1) = \{0, 1, \ldots\}$, $H(0) =
\{-1, 0, 1\}$, and $H(-1) = \{\ldots, -2, -1, 0\}$. This is not a partition because every
preimage contains 0.
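
The overlap among these preimages can be verified on a finite window of the integers. The window and helper names are assumptions; the sensor is the disturbed sign sensor of Example 11.1.8.

```python
def sgn(v):
    """Sign of an integer as -1, 0, or 1."""
    return (v > 0) - (v < 0)


PSI = (-1, 0, 1)                     # nature sensing actions


def h(x, psi):
    """Disturbed sign sensor: y = sgn(x + psi)."""
    return sgn(x + psi)


def preimage(y, window):
    """H(y) restricted to a finite window of X = Z."""
    return {x for x in window if any(h(x, psi) == y for psi in PSI)}


WINDOW = range(-3, 4)
H = {y: preimage(y, WINDOW) for y in (-1, 0, 1)}
common = set.intersection(*H.values())
print({y: sorted(xs) for y, xs in H.items()})
print(common)
```

Any state that appears in more than one preimage cannot be ruled out by either of the corresponding observations, which is why the sets fail to form a partition.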

Finally, one example of a history-based sensor mapping is given. 

Example 11.1.10 (Delayed-observation sensor) Let $X = Y = \mathbb{Z}$. A
delayed-observation sensor can be defined for some fixed positive integer $i$ as $y_k = x_{k-i}$.
Thus, it indicates what the state was $i$ stages ago. In this case, it gives a perfect
measurement of the old state value. Many other variants are possible. For example,
it might only give the sign of the state, $i$ stages ago. ■

Figure 11.2: In each stage, $k$, an observation, $y_k \in Y$, is received, and an action,
$u_k \in U$, is applied. The state, $x_k$, however, is hidden from the decision maker.



11.1.2 Defining the Information Space 

Suppose that $X$, $U$, and $f$ have been defined as in Formulation ??, and the notion
of stages has been defined as in Formulation I. This yields state sequences $x_1$, $x_2$,
. . . and action sequences $u_1$, $u_2$, . . . during the execution of a plan. However, in
the current setting, the state sequence is not known. Instead, at every stage, an
observation, $y_k$, is obtained. The process is depicted in Figure 11.2.

In previous formulations, the action space, U(x), was generally allowed to 
depend on $x$. Since $x$ is currently unknown, it would seem strange to allow the
actions to depend on $x$. This would mean that inferences could be made
regarding the state simply by noticing which actions are available. Instead, it will
be assumed that $U$ is fixed for all $x \in X$.

Initial conditions As stated at the beginning of the chapter, the initial
conditions provide one of the three general sources of information regarding the
state. Three alternative types of initial conditions will be allowed:

1. The initial state, $x_1 \in X$, is given. This initializes the problem with
perfect state information. Assuming nature actions interfere with the state
transition equation $f$, uncertainty in the current state will generally develop.

2. A set of states, $X_1 \subseteq X$, is given. In this case, the initial state
is only known to lie within a particular subset of $X$. This can be considered as
a generalization of the first type, which only allowed singleton subsets.

3. A probability distribution, $P(x)$, over $X$ is given.

In general, let $\eta_0$ denote the initial condition, which may be any one of the
three alternative types.

History Suppose that the $k$th stage has passed. What information is available?
It will be assumed that at every stage, a sensor observation is made. This
yields a sensing history, $(y_1, y_2, \ldots, y_k)$. At every stage an action is
also applied. This yields an action history, $(u_1, u_2, \ldots, u_{k-1})$. Note
that the sequence only runs to $u_{k-1}$, instead of $u_k$, because once $u_k$ is
applied, state $x_{k+1}$ and stage $k+1$ are obtained.

By combining the sensing and action histories, the history, $\lambda_k$, at stage
$k$ is the sequence

$$\lambda_k = (u_1, \ldots, u_{k-1}, y_1, \ldots, y_k). \tag{11.12}$$

Information state The history, $\lambda_k$, in combination with the initial
condition, $\eta_0$, yields the information state, which is denoted by $\eta_k$.
This corresponds to all information that is known up to stage $k$. In spite of
the fact that the states, $x_1$, $\ldots$, $x_k$, might not be known, the
information states are always known because they are defined directly in terms
of available information. The information state may be denoted as

$$\eta_k = (\eta_0, u_1, \ldots, u_{k-1}, y_1, \ldots, y_k), \tag{11.13}$$

or in short form, $\eta_k = (\eta_0, \lambda_k)$. When representing information
spaces, we will generally ignore the problem of nesting parentheses; the short
form actually expresses a sequence of two sequences, while (11.13) is a single
sequence. This distinction is insignificant for the purposes of decision making.

The information state, $\eta_k$, can also be expressed as

$$\eta_k = (\eta_{k-1}, u_{k-1}, y_k), \tag{11.14}$$

by noticing that the information state at stage $k$ contains all of the
information from the information state at stage $k-1$. The only new information
is the previously applied action, $u_{k-1}$, and the current sensor observation,
$y_k$.
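
The recursive form (11.14) can be sketched in a few lines of Python. This is only
an illustration: the function names and the choice to store an information state
as a triple (initial condition, action history, observation history) are
assumptions made here, not part of the formulation.

```python
from typing import Hashable, Tuple

def initial_information_state(eta0: Hashable, y1: Hashable) -> Tuple:
    """eta_1 = (eta_0, y_1): no action has been applied yet."""
    return (eta0, (), (y1,))

def extend(eta: Tuple, u: Hashable, y: Hashable) -> Tuple:
    """Build eta_{k+1} from eta_k, the applied action u_k, and y_{k+1}."""
    eta0, actions, observations = eta
    return (eta0, actions + (u,), observations + (y,))

def stage(eta: Tuple) -> int:
    """The stage index k equals the number of observations received."""
    return len(eta[2])

# Hypothetical data: the initial condition is a set of states.
eta = initial_information_state(frozenset({0, 2}), 2)
eta = extend(eta, 1, 3)
```

Note that the action history always has one entry fewer than the observation
history, as in (11.13).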

Information space The information space will simply be the set of all possible
information states. Although the information states appear to be quite
complicated, it is helpful to think of them abstractly as points in a set that is
called the information space. To define the set of all possible information
states, we will need careful definitions of the set of all initial conditions,
actions, and observations.

The set of all observations is always $Y$. Therefore, the set of all observation
histories is $Y^k$, which is obtained by a Cartesian product of $k$ copies of the
observation space, $Y$:

$$Y^k = Y \times Y \times \cdots \times Y. \tag{11.15}$$

Similarly, the set of all action histories is given by $U^{k-1}$, the Cartesian
product of $k-1$ copies of the action space, $U$.

It is slightly more complicated to define the set of all possible initial
conditions because three different types of initial conditions were possible. Let
$\mathcal{I}_0$ denote the initial condition space. Depending on which of the
three types of initial conditions is used, one of the following three
definitions of $\mathcal{I}_0$ is used:

1. If the initial state, $x_1$, is given, then $\mathcal{I}_0 \subseteq X$.
Typically, $\mathcal{I}_0 = X$; however, it might be known in some instances that
certain initial states are impossible. Therefore, it is generally written that
$\mathcal{I}_0 \subseteq X$.

2. If $X_1$ is given, then $\mathcal{I}_0 \subseteq \mathrm{pow}(X)$, in which
pow denotes the power set. Again, a typical situation is
$\mathcal{I}_0 = \mathrm{pow}(X)$; however, it might be known that certain
subsets of $X$ are impossible as initial conditions.

3. Finally, if $P(x)$ is given, then $\mathcal{I}_0 \subseteq \mathcal{P}(X)$, in
which $\mathcal{P}(X)$ is the set of all probability distributions over $X$.

The information space at stage $k$ can be expressed as

$$\mathcal{I}_k = \mathcal{I}_0 \times U^{k-1} \times Y^k. \tag{11.16}$$

Thus, each $\eta_k \in \mathcal{I}_k$ yields an initial condition, action
history, and observation history.

It will be convenient to consider information spaces that do not depend on $k$.
This will be defined by simply taking a union. If there are $K$ stages, then the
information space, $\mathcal{I}$, is

$$\mathcal{I} = \mathcal{I}_1 \cup \mathcal{I}_2 \cup \cdots \cup \mathcal{I}_K. \tag{11.17}$$

If the number of stages is not fixed, then $\mathcal{I}$ is defined to be the
union of $\mathcal{I}_k$ over all $k \in \mathbb{N}$. The situation is similar to
the state space obtained for time-varying motion planning in Section 7.1. The
information space is naturally time-dependent because information accumulates
over time. In our discrete model, the reference to time is only implicit through
the use of stages. Therefore, stage-dependent information spaces were defined.
Taking the union of all of these is similar to the way the state space was formed
in Section 7.1 by making time one axis of the state space. For the information
space, $\mathcal{I}$, the stage index, $k$, can be imagined as an "axis".

One immediate concern regarding the information space, $\mathcal{I}$, is that
the information states may be arbitrarily long because the history grows linearly
with the number of stages. For now, it is helpful to simply imagine $\mathcal{I}$
abstractly as another kind of state space, without paying close attention to how
complicated each $\eta \in \mathcal{I}$ may be to represent. In many contexts,
there exist ways to simplify the information state representation. This will be
the topic of Section 11.2.

11.1.3 Defining a Planning Problem

Now that the information space has been defined, in many ways it can be
considered as another kind of state space; however, it is important to keep in
mind that the information space was derived from another state space for which
perfect state observations could not be obtained. The next task is to define
planning problems on the information space.

In Section ??, a feedback plan was defined as a function of the state. Here a
feedback plan is instead a function of the information state. Decisions cannot be
based on the state because it will be generally unknown during execution of the
plan. However, the information state is always known. Therefore, it is logical to
base decisions on the information state.

Let $\pi_K$ denote a $K$-step information-feedback plan, which is a sequence
$(\pi_1, \pi_2, \ldots, \pi_K)$ of $K$ functions, $\pi_k : \mathcal{I}_k \to U$.
Thus, at every stage, $k$, the information state $\eta_k \in \mathcal{I}_k$ is
used as a basis for choosing the action $u_k = \pi_k(\eta_k)$. Due to
interference of nature through both the state transition equation and the sensor
mapping, the action sequence, $(u_1, \ldots, u_K)$, produced by a plan, $\pi_K$,
will not be known until the plan terminates.
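
To make the role of the information state concrete, the following minimal sketch
executes an information-feedback plan while keeping the state hidden from it. It
is a sketch under stated assumptions: nature is simulated by a seeded random
choice, the termination action is represented by the string "T", and the toy
model at the bottom (three states, additive disturbances) is hypothetical.

```python
import random

def execute_plan(plan, x1, eta0, f, h, Theta, Psi, u_T, max_stages=50, seed=0):
    """Run an information-feedback plan; the plan never sees the state x."""
    rng = random.Random(seed)              # stands in for nature's choices
    x = x1                                 # hidden state
    y = h(x, rng.choice(sorted(Psi(x))))   # first observation y_1
    eta = (eta0, (), (y,))                 # eta_1 = (eta_0, y_1)
    applied = []
    for _ in range(max_stages):
        u = plan(eta)                      # u_k depends only on eta_k
        applied.append(u)
        if u == u_T:                       # termination: the state stays fixed
            break
        x = f(x, u, rng.choice(sorted(Theta(x, u))))
        y = h(x, rng.choice(sorted(Psi(x))))
        eta = (eta[0], eta[1] + (u,), eta[2] + (y,))
    return applied, eta, x

# Hypothetical toy model used only to exercise the loop.
f = lambda x, u, theta: (x + u + theta) % 3
h = lambda x, psi: x + psi
Theta = lambda x, u: {0, 1}
Psi = lambda x: {0, 1, 2}
plan = lambda eta: "T" if eta[2][-1] == 4 else 1   # stop once y = 4 is seen
applied, eta, x = execute_plan(plan, 0, frozenset({0, 2}), f, h, Theta, Psi, "T")
```

The action sequence in `applied` is only known after execution, because each
action depends on observations that nature influences.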

Just as in Formulation 2.4.2, it will be convenient to assume that $U$ contains
a termination action, $u_T$. If $u_T$ is applied to $\eta_k$ at stage $k$, then
$u_T$ is repeatedly applied forever. It is assumed once again that the state,
$x_k$, remains fixed after the termination condition is applied. Remember,
however, that $x_k$ is still unknown in general; it becomes fixed but unknown.
Technically, based on the definition of information spaces, the information
state must change after $u_T$ is applied because the history grows. These
information states can be ignored, however, because no new decisions are made
after $u_T$ is applied. A plan that uses a termination condition can be specified
as $\pi = (\pi_1, \pi_2, \ldots)$, because the number of stages may vary each
time the plan is executed.

We are almost ready to define the planning problem. This will require the
specification of a cost functional. The cost will depend on the history,
$\sigma$, of states and actions, as in Section ??. The planning formulation
involves the following components, summarizing most of the concepts introduced so
far in Section 11.1:

Formulation 11.1.1 (Discrete Information Space Planning)

1. A nonempty state space, $X$, which is either finite or countably infinite.

2. A finite action space, $U$. It is assumed that $U$ contains a special
termination action, which has the same effect as defined in Formulation 2.4.2.

3. A finite nature action space, $\Theta(x,u)$, for each $x \in X$ and $u \in U$.

4. A state transition equation, $f$, that produces a state, $f(x,u,\theta)$, for
every $x \in X$, $u \in U$, and $\theta \in \Theta(x,u)$.

5. A finite or countably infinite observation space, $Y$.

6. A finite nature observation action space, $\Psi(x)$, for each $x \in X$.

7. A sensor mapping, $h$, which produces an observation, $y = h(x,\psi)$, for
each $x \in X$ and $\psi \in \Psi(x)$. This definition assumes a state-nature
sensor mapping. A state sensor mapping or history-based sensor mapping, as
defined in Section 11.1.1, may alternatively be used.

8. A set of stages, each denoted by $k$, which begins at $k = 1$ and continues
indefinitely.

9. An initial condition, $\eta_0$, which is an element of an initial condition
space, $\mathcal{I}_0$.

10. A goal set, $X_G \subseteq X$.

11. An information space, $\mathcal{I}$, which is the union of the information
spaces, $\mathcal{I}_k = \mathcal{I}_0 \times U^{k-1} \times Y^k$, for each
stage $k$.

12. Let $L$ denote a real-valued, additive cost functional, which may be applied
to any state-action history,
$\sigma_K = (x_1, \ldots, x_{K+1}, u_1, \ldots, u_K)$, to yield

$$L(\sigma_K) = \sum_{k=1}^{K} l(x_k, u_k) + l_F(x_{K+1}). \tag{11.18}$$

If the termination action, $u_T$, is applied at some stage $k$, then for all
$i \geq k$, $u_i = u_T$, $x_i = x_k$, and $l(x_i, u_T) = 0$.

Using Formulation 11.1.1, either a feasible or optimal planning problem can
be defined. To obtain a feasible planning problem, let $l(x_k, u_k) = 0$ for all
$x_k \in X$ and $u_k \in U$, and let

$$l_F(x_{K+1}) = \begin{cases} 0 & \text{if } x_{K+1} \in X_G \\ \infty & \text{otherwise.} \end{cases} \tag{11.19}$$

To obtain an optimal planning problem, $l(x_k, u_k)$ may in general assume any
nonnegative, finite value.
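
As a small illustration of the additive cost functional, the sketch below
evaluates $L(\sigma_K)$ for a given state-action history. The function names, the
string "T" for the termination action, and the sample histories are assumptions
made for this example only.

```python
import math

def cost(states, actions, l, l_F, u_T="T"):
    """L(sigma_K) = sum_k l(x_k, u_k) + l_F(x_{K+1}).

    states holds x_1..x_{K+1}; actions holds u_1..u_K.
    Stages at which the termination action is applied cost nothing.
    """
    assert len(states) == len(actions) + 1
    stage_costs = sum(0 if u == u_T else l(x, u) for x, u in zip(states, actions))
    return stage_costs + l_F(states[-1])

def feasible_terminal_cost(goal):
    """Terminal cost for feasible planning: 0 inside the goal set, infinity outside."""
    return lambda x: 0 if x in goal else math.inf

unit = lambda x, u: 1                       # hypothetical per-stage cost
in_goal = cost([0, 2, 2], [1, "T"], unit, feasible_terminal_cost({2}))
missed = cost([0, 1], [1], unit, feasible_terminal_cost({2}))
```

With a zero per-stage cost, the same function distinguishes only between
histories that end in the goal set (cost 0) and those that do not (infinite
cost).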

The Information Space is Just Another State Space It will become im- 
portant throughout this chapter and Chapter 12 to realize that in many ways the 
information space can be treated as an ordinary state space. It only seems special 
because it is itself derived from another state space, but once this is forgotten, 
it exhibits many properties of an ordinary state space in planning. One nice fea- 
ture is that the state in this new space is always known. Thus, by converting 
from an original state space to its information space, we also convert from having 
imperfect state information to always knowing the (information) state. 

One important consequence of this interpretation is that the state transition
equation can be lifted into the information space to obtain an information
transition equation, $f_{\mathcal{I}}$. Suppose there are no nature actions. In
this case, future states are predictable, which leads to

$$\eta_{k+1} = f_{\mathcal{I}}(\eta_k, u_k). \tag{11.20}$$

The function $f_{\mathcal{I}}$ generates $\eta_{k+1}$ by concatenating $u_k$ and
$y_{k+1} = h(x_{k+1}) = h(f(x_k, u_k))$ to $\eta_k$. If there are nature actions,
$\theta_k$, and/or nature sensing actions, $\psi_{k+1}$, then

$$\eta_{k+1} = f_{\mathcal{I}}(\eta_k, u_k, \theta_k, \psi_{k+1}), \tag{11.21}$$

which reflects the fact that future information states are unpredictable. Once
$\theta_k$ and $\psi_{k+1}$ are chosen by nature, then $\eta_{k+1}$ is obtained
by concatenating $u_k$ and

$$y_{k+1} = h(x_{k+1}, \psi_{k+1}) = h(f(x_k, u_k, \theta_k), \psi_{k+1}) \tag{11.22}$$

to the history. Note, however, that even though nature causes future information
states to be unpredictable, the current information state is always known. A
plan, $\pi : \mathcal{I} \to U$, now seems like a state-feedback plan, if the
information space is viewed as a state space. The transitions are all specified
by $f_{\mathcal{I}}$.

11.2 Alternative Representations of Information 
Spaces 

The information space in its original form appears to be quite complicated. Every
information state corresponds to a history of actions and observations. The
length of the information state vector unfortunately grows linearly with the
number of stages. This motivates many methods that try to reduce or simplify the
representation of the information space in some way. In many applications, the
ability to perform this simplification is critical to finding a practical
solution. In some cases, the simplification preserves the structure of the
original information space, meaning that completeness, and optimality if
applicable, will not be lost by using the simpler representation. In other cases,
we might be willing to tolerate a simplification that results in an
approximation of the information space. Such an approach may be the only way to
handle the most challenging problems.

This section involves a substantial amount of notation. It is easy to become 
lost without frequent consideration of examples. Section ?? will present several 
detailed examples that illustrate the concepts presented in Sections 11.1 and 11.2. 
In this section, Example 11.2.1, which is very simple and less interesting, will be 
used to provide immediate illustration of some notation and concepts. 

11.2.1 Nondeterministic Derived Information States 

This section assumes that nature is modeled nondeterministically, which means
that there is no information about what actions nature will choose, other than
that the actions will be chosen from $\Theta$ and $\Psi$. Further assume that the
state-action sensor mapping from Section 11.1.1 is used. Consider what inferences
may be drawn from an information state, $\eta_k = (\eta_0, \lambda_k)$. Since the
model does not involve probabilities, suppose that $\eta_0$ represents a set
$X_1 \subseteq X$. Using the history, $\lambda_k$, together with several
components from Formulation 11.1.1, we can calculate a minimal subset of $X$ in
which $x_k$ is known to lie. Let $X_k(\eta_k)$ refer to this subset, which will
be referred to as a derived information state. It is always true that
$x_k \in X$. Thus, it is important to make $X_k$ as small as possible by removing
any states that are impossible values for $x_k$.

Recall from (11.11) that for every observation, $y_k$, a set
$H(y_k) \subseteq X$ of possible values for $x_k$ can be inferred. This could
serve as a crude estimate of the derived information state. It is certainly known
that $X_k(\eta_k) \subseteq H(y_k)$; otherwise, the current state, $x_k$, would
not be consistent with the current sensor observation. If we carefully progress
from the initial conditions, while applying constraints due to the state
transition equation, the appropriate subset of $H(y_k)$ will be obtained.

From the state transition equation, $f$, define a set-valued function, $F$, which
yields a subset of $X$ for every $x \in X$ and $u \in U$ as

$$F(x, u) = \{x' \in X \mid \exists \theta \in \Theta(x,u) \text{ for which } x' = f(x, u, \theta)\}. \tag{11.23}$$

Note that both $F$ and $H$ are set-valued functions that eliminate the direct
appearance of nature actions. The effect of nature is taken into account in the
set that is obtained when these functions are applied. This will be very
convenient for computing the derived information state.
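
For a finite model, both set-valued functions can be built directly from $f$,
$\Theta$, $h$, and $\Psi$ by enumeration. The following is a minimal sketch; the
helper names `make_F` and `make_H` are hypothetical, and the small model used to
exercise them borrows the data of Example 11.2.1 later in this section.

```python
def make_F(f, Theta):
    """F(x, u): all states reachable from x under u for some nature action."""
    return lambda x, u: {f(x, u, theta) for theta in Theta(x, u)}

def make_H(X, h, Psi):
    """H(y): all states that could have produced observation y (X must be finite)."""
    return lambda y: {x for x in X if any(h(x, psi) == y for psi in Psi(x))}

# Data of the three-state example: f = (x + u + theta) mod 3, h = x + psi.
X = {0, 1, 2}
F = make_F(lambda x, u, theta: (x + u + theta) % 3, lambda x, u: {0, 1})
H = make_H(X, lambda x, psi: x + psi, lambda x: {0, 1, 2})
```

Nature's choices no longer appear in the signatures of `F` and `H`; they are
absorbed into the returned sets.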

It will be convenient to generally use the notation $X(\cdot)$ as a subset of $X$
that is derived using whatever information appears in the place of $\cdot$. It
may sometimes be denoted as $X_k(\cdot)$ to additionally denote the particular
stage, $k$.

An inductive process will now be described that results in computing the
derived information state, $X_k(\eta_k)$, for any stage $k$. The base case,
$k = 1$, of the induction proceeds as

$$X_1(\eta_1) = X_1(\eta_0, y_1) = X_1 \cap H(y_1). \tag{11.24}$$

The first part of the equation replaces $\eta_1$ with $(\eta_0, y_1)$, which is
the long form of the information state. There are not yet any actions in the
history. The second part applies set intersection to make consistent the two
pieces of information: 1) the initial state lies in $X_1$, which is the initial
condition, and 2) the states in $H(y_1)$ are possible given the observation
$y_1$.

Now assume inductively that the derived information state
$X_k(\eta_k) \subseteq X$ has been computed, and the task is to compute the
derived information state, $X_{k+1}(\eta_{k+1})$. Recall that
$\eta_{k+1} = (\eta_k, u_k, y_{k+1})$. Thus, the only new pieces of information
are that $u_k$ was applied and $y_{k+1}$ was observed. These will be considered
one at a time.

Consider computing $X_{k+1}(\eta_k, u_k)$. If $x_k$ was known, then after
applying $u_k$, the state could lie anywhere within $F(x_k, u_k)$, using (11.23).
Although $x_k$ is actually not known, it is, however, known that
$x_k \in X_k(\eta_k)$. Therefore,

$$X_{k+1}(\eta_k, u_k) = \bigcup_{x_k \in X_k(\eta_k)} F(x_k, u_k). \tag{11.25}$$

This can be considered as the set of all states that can be reached by starting
from some state in $X_k(\eta_k)$, and applying the action $u_k \in U$ and some
nature action $\theta_k \in \Theta(x_k, u_k)$.

The next step is to take into account the observation $y_{k+1}$. This information
alone indicates that $x_{k+1}$ lies in $H(y_{k+1})$. Therefore, an intersection
is performed to obtain the derived information state,

$$X_{k+1}(\eta_{k+1}) = X_{k+1}(\eta_k, u_k, y_{k+1}) = X_{k+1}(\eta_k, u_k) \cap H(y_{k+1}). \tag{11.26}$$

It has now been shown how to compute $X_{k+1}(\eta_{k+1})$ from $X_k(\eta_k)$.
After starting with (11.24), the derived information states at any stage can be
computed by iterating (11.25) and (11.26) as many times as necessary.
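
The induction amounts to a short set-based update. The sketch below is one way to
write it, assuming finite sets and that $F$ and $H$ are supplied as Python
functions returning sets; the function names are illustrative, and the numbers at
the bottom use the data of Example 11.2.1.

```python
def base_case(X1, y1, H):
    """(11.24): intersect the initial condition with the first preimage."""
    return set(X1) & H(y1)

def predict(Xk, u, F):
    """(11.25): union of F(x, u) over every state still considered possible."""
    return set().union(*(F(x, u) for x in Xk))

def update(Xk, u, y_next, F, H):
    """(11.26): prediction followed by intersection with the new preimage."""
    return predict(Xk, u, F) & H(y_next)

# Three-state data: f = (x + u + theta) mod 3 with theta in {0, 1},
# and y = x + psi with psi in {0, 1, 2}.
X = {0, 1, 2}
F = lambda x, u: {(x + u + theta) % 3 for theta in (0, 1)}
H = lambda y: {x for x in X if 0 <= y - x <= 2}

X1_eta1 = base_case({0, 2}, 2, H)        # initial condition {0, 2}, y_1 = 2
X2_pred = predict(X1_eta1, 1, F)         # after applying u_1 = 1
X2_eta2 = update(X1_eta1, 1, 3, F, H)    # after observing y_2 = 3
```

Only the current derived information state is carried from stage to stage; the
history itself is never stored.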

Because the derived information state is always a subset of $X$, a derived
information space, denoted by $\mathcal{I}^\circ$, can be defined as
$\mathcal{I}^\circ = \mathrm{pow}(X)$. If $X$ is finite, then
$\mathcal{I}^\circ$ is also finite, which was not the case with $\mathcal{I}$
because the histories continued to grow with the number of stages. Thus, if the
number of stages is unbounded or large in comparison to the size of $X$, then
derived information states seem preferable. It is also convenient that in
$\mathcal{I}^\circ$ there does not need to be an explicit reference to stages. It
truly appears to be the appropriate "state space" for the problem. For the
planning problem, the goal region, $X_G$, can be expressed directly as a derived
information state. In this way, the planning task is to terminate in a derived
information state $X_K$ for which $X_K \subseteq X_G$. The history does not even
have to be explicitly maintained. All computations can be performed directly in
terms of derived information states.

The following example is not very interesting in itself, but it is simple enough 
to illustrate the concepts introduced so far. 

Example 11.2.1 (Three-State Example) Consider the following components:

1. A state space, $X = \{0, 1, 2\}$.

2. An action space, $U = \{-1, 0, 1\}$.

3. A nature action space, $\Theta(x) = \{0, 1\}$ for all $x \in X$.

4. A state transition equation, $f(x, u, \theta) = (x + u + \theta) \bmod 3$.

5. An observation space, $Y = \{0, 1, 2, 3, 4\}$.

6. A nature observation action space, $\Psi(x) = \{0, 1, 2\}$ for all $x \in X$.

7. A sensor mapping, $y = h(x, \psi) = x + \psi$.

The original information state representation based on histories appears very
cumbersome for this example, which only involves three states. The derived
information space for this example is

$$\mathcal{I}^\circ = \{\emptyset, \{0\}, \{1\}, \{2\}, \{0,1\}, \{1,2\}, \{0,2\}, \{0,1,2\}\}, \tag{11.27}$$

which is the power set of $X = \{0, 1, 2\}$. Note, however, that the empty set,
$\emptyset$, can usually be deleted from $\mathcal{I}^\circ$.¹ Suppose that the
initial condition is $X_1 = \{0, 2\}$, and that the initial state is $x_1 = 0$.
The initial state is unknown to the decision maker, but it is needed to
construct the example because we need to make sure that valid observations will
be made.

¹ One notable exception is in the theory of nondeterministic finite automata, in
which it is possible that all copies of the machine die, and there is no
possible current state [711].

Now consider the execution over a number of stages. Suppose that the first
observation, $y_1$, is received as $y_1 = 2$. Based on the sensor mapping,
$H(y_1) = H(2) = \{0, 1, 2\}$, which is not very helpful since $H(2) = X$.
Applying (11.24) yields $X_1(\eta_1) = \{0, 2\}$. Now suppose that the decision
maker applies the action $u_1 = 1$, and nature applies $\theta_1 = 1$. Using $f$,
this yields $x_2 = 2$. The decision maker does not know $\theta_1$, and must
therefore take into account any nature action that could have been applied. It
uses (11.25) to infer that

$$X_2(\eta_1, u_1) = F(0, 1) \cup F(2, 1) = \{1, 2\} \cup \{0, 1\} = \{0, 1, 2\}. \tag{11.28}$$

Now suppose that $y_2 = 3$. From the sensor mapping, $H(3) = \{1, 2\}$. Applying
(11.26) yields

$$X_2(\eta_2) = X_2(\eta_1, u_1) \cap H(y_2) = \{0, 1, 2\} \cap \{1, 2\} = \{1, 2\}. \tag{11.29}$$

This process may be repeated for as many stages as desired. It can be seen that
a path is generated through $\mathcal{I}^\circ$ by visiting a sequence of derived
information states. Note that if the observation $y_k = 4$ is ever received, the
state, $x_k$, will become immediately known because $H(4) = \{2\}$. ■
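
Because the derived information space of this example is finite, the paths
through it can be explored exhaustively. The following sketch, which is only an
illustration and not part of the example itself, enumerates every nonempty
derived information state that can arise from $X_1 = \{0, 2\}$ under some
sequence of actions and observations, using a breadth-first search.

```python
from collections import deque

X = {0, 1, 2}
U = (-1, 0, 1)
Y = range(5)

def F(x, u):
    return {(x + u + theta) % 3 for theta in (0, 1)}

def H(y):
    return {x for x in X if 0 <= y - x <= 2}    # psi = y - x must lie in {0, 1, 2}

def successors(Xk):
    """All (u, y, next derived state) triples consistent with the model."""
    for u in U:
        predicted = set().union(*(F(x, u) for x in Xk))
        for y in Y:
            nxt = frozenset(predicted & H(y))
            if nxt:                              # an empty set means y is impossible
                yield u, y, nxt

def reachable(X1):
    """Breadth-first search over derived information states."""
    start = {frozenset(X1 & H(y)) for y in Y} - {frozenset()}
    seen, queue = set(start), deque(start)
    while queue:
        for _, _, nxt in successors(queue.popleft()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

states = reachable({0, 2})
```

Observations that would produce the empty set are skipped because they cannot
occur when the model is correct, which is consistent with deleting the empty set
from the derived information space.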

Is the derived information space, $\mathcal{I}^\circ$, equivalent in some way to
the original information space, $\mathcal{I}$? The derived information space
appears to be simpler; therefore, it seems that some information was lost. The
construction of $\mathcal{I}^\circ$ was obtained by mapping information states,
$\eta_k$, to derived information states, $X_k(\eta_k)$. It is certainly possible
that many information states could map to the same derived information state.
When using the derived information space, it is important to answer the
following question:

For the purposes of decision making, is it sufficient to know the set of possible
states, or is it important to additionally know what history led to this set of
possible states?

The answer is usually that the set of possible states is sufficient. If it is
known that $x_k$ lies within a particular subset of $X$, given by the derived
information state, there is nothing else to learn from the history of how the
subset was derived. Note that it is generally impossible to recover the history
from a derived information state.

11.2.2 Probabilistic Derived Information States 

If nature is modeled probabilistically, it turns out that the derived information
states can be determined once again. In this case, each derived information state
is a probability distribution, as opposed to a set. The set union and
intersection of (??) and (??) are replaced in this section by marginalization and
Bayes rule, respectively. In a sense, these are the probabilistic equivalents of
union and intersection. It will be very helpful to compare the expressions from
this section to those of Section 11.2.1. Most expressions in this section of the
form $P(x_k \mid \cdot)$ will have an equivalent expression in Section 11.2.1 of
the form $X_k(\cdot)$.

The first step is to make probabilistic versions of $H$ and $F$. These are
$P(x_k \mid y_k)$ and $P(x_{k+1} \mid x_k, u_k)$. The latter term was given in
Section ??. To obtain $P(x_k \mid y_k)$, recall from Section ?? that
$P(y_k \mid x_k)$ can be easily derived from $P(\psi_k \mid x_k)$. To obtain
$P(x_k \mid y_k)$, Bayes rule can be applied. Recall from basic probability
theory that

$$P(x_k, y_k) = P(x_k \mid y_k) P(y_k) = P(y_k \mid x_k) P(x_k). \tag{11.30}$$

Solving for $P(x_k \mid y_k)$ yields

$$P(x_k \mid y_k) = \frac{P(y_k \mid x_k) P(x_k)}{P(y_k)} = \frac{P(y_k \mid x_k) P(x_k)}{\sum_{x_k \in X} P(y_k \mid x_k) P(x_k)}. \tag{11.31}$$

In the last step, $P(y_k)$ was rewritten using marginalization. Note that in this
case $x_k$ appears as the sum index; therefore, the denominator is only a
function of $y_k$, as required. Bayes rule requires knowing the prior, $P(x_k)$.
In the coming derivation, this will be replaced by a derived information state.
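
A direct transcription of (11.31) for a finite state space is sketched below.
The representation of distributions as dictionaries and the sample sensor model
(the nature sensing action drawn uniformly from $\{0, 1, 2\}$, with
$y = x + \psi$) are assumptions made only for this illustration; exact fractions
are used to avoid rounding.

```python
from fractions import Fraction

def bayes_posterior(prior, likelihood, y):
    """Return P(x | y) from the prior P(x) and the sensor model P(y | x).

    prior: dict mapping each state x to P(x).
    likelihood: function (y, x) -> P(y | x).
    """
    joint = {x: likelihood(y, x) * p for x, p in prior.items()}
    evidence = sum(joint.values())          # P(y), obtained by marginalization
    if evidence == 0:
        raise ValueError("observation has zero probability under the prior")
    return {x: j / evidence for x, j in joint.items()}

# Assumed sensor model: psi uniform on {0, 1, 2}, so P(y | x) = 1/3 when
# y - x is in {0, 1, 2} and 0 otherwise.
def sensor(y, x):
    return Fraction(1, 3) if 0 <= y - x <= 2 else Fraction(0)

uniform_prior = {x: Fraction(1, 3) for x in (0, 1, 2)}
posterior = bayes_posterior(uniform_prior, sensor, 3)
```

States with zero likelihood receive zero posterior probability, which mirrors
the way intersection with the preimage removes impossible states in the
nondeterministic case.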

Next consider defining derived information states for the probabilistic case.
Each state is a probability distribution over $X$, and can be written as
$P(x_k \mid \eta_k)$, if derived from $\eta_k$. The initial condition produces
$P(x_1)$. Once again, derived information states can be computed inductively. For
the base case, the only new piece of information is $y_1$. Thus, the derived
information state, $P(x_1 \mid \eta_1)$, is $P(x_1 \mid y_1)$. This is computed
by letting $k = 1$ in (11.31) to yield

$$P(x_1 \mid \eta_1) = P(x_1 \mid y_1) = \frac{P(y_1 \mid x_1) P(x_1)}{\sum_{x_1 \in X} P(y_1 \mid x_1) P(x_1)}. \tag{11.32}$$

Now consider the inductive step by assuming that $P(x_k \mid \eta_k)$ is given.
The task is to determine $P(x_{k+1} \mid \eta_{k+1})$, which is equivalent to
$P(x_{k+1} \mid \eta_k, u_k, y_{k+1})$. Just as in Section 11.2.1, this will
proceed in two parts by first considering the effect of $u_k$, followed by
$y_{k+1}$. The first step is to determine $P(x_{k+1} \mid \eta_k, u_k)$ from
$P(x_k \mid \eta_k)$. First, note that

$$P(x_{k+1} \mid \eta_k, x_k, u_k) = P(x_{k+1} \mid x_k, u_k) \tag{11.33}$$

because $\eta_k$ does not contain any additional information regarding the
prediction of $x_{k+1}$ once $x_k$ is given. Marginalization, from probability
theory, can be used to eliminate $x_k$ from $P(x_{k+1} \mid x_k, u_k)$. This must
be eliminated because it is not given. Putting these steps together yields

$$P(x_{k+1} \mid \eta_k, u_k) = \sum_{x_k \in X} P(x_{k+1} \mid x_k, u_k, \eta_k) P(x_k \mid \eta_k) = \sum_{x_k \in X} P(x_{k+1} \mid x_k, u_k) P(x_k \mid \eta_k), \tag{11.34}$$

which expresses $P(x_{k+1} \mid \eta_k, u_k)$ in terms of given quantities.

The next step is to take into account the observation, $y_{k+1}$. This is
accomplished by making a version of (11.31) that is conditioned on the
information accumulated so far: $\eta_k$ and $u_k$. Also, $k$ is replaced with
$k+1$. The result is

$$P(x_{k+1} \mid y_{k+1}, \eta_k, u_k) = \frac{P(y_{k+1} \mid x_{k+1}, \eta_k, u_k) P(x_{k+1} \mid \eta_k, u_k)}{\sum_{x_{k+1} \in X} P(y_{k+1} \mid x_{k+1}, \eta_k, u_k) P(x_{k+1} \mid \eta_k, u_k)}. \tag{11.35}$$

The left side of (11.35) is equivalent to $P(x_{k+1} \mid \eta_{k+1})$, which is
the derived information state for stage $k+1$, as desired. There are two
different kinds of terms on the right. The expression for
$P(x_{k+1} \mid \eta_k, u_k)$ was given in (11.34). Therefore, the only remaining
term to calculate is $P(y_{k+1} \mid x_{k+1}, \eta_k, u_k)$. Note that

$$P(y_{k+1} \mid x_{k+1}, \eta_k, u_k) = P(y_{k+1} \mid x_{k+1}) \tag{11.36}$$

because the sensor mapping depends only on the state (and the probability model
for the nature observation action, which also depends only on the state). Since
$P(y_{k+1} \mid x_{k+1})$ is specified as part of the sensor model, we are
finished deriving the computation of $P(x_{k+1} \mid \eta_{k+1})$ from
$P(x_k \mid \eta_k)$.
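
The two parts of the inductive step can be sketched as a pair of functions over a
finite state space. This is an illustration under assumptions: distributions are
stored as dictionaries, the transition and sensor models are passed in as
functions, and the two-state model used to exercise the code is hypothetical.

```python
from fractions import Fraction

def predict(belief, u, trans):
    """(11.34): marginalize over x_k; trans(x_next, x, u) = P(x_next | x, u)."""
    return {xn: sum(trans(xn, x, u) * p for x, p in belief.items())
            for xn in belief}

def correct(predicted, y, sensor):
    """(11.35) with (11.36): Bayes rule; sensor(y, x) = P(y | x)."""
    unnormalized = {x: sensor(y, x) * p for x, p in predicted.items()}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation is impossible under the predicted belief")
    return {x: w / total for x, w in unnormalized.items()}

def step(belief, u, y, trans, sensor):
    """Compute P(x_{k+1} | eta_{k+1}) from P(x_k | eta_k), u_k, and y_{k+1}."""
    return correct(predict(belief, u, trans), y, sensor)

# Hypothetical two-state model: the action shifts the state modulo 2 with
# probability 3/4, and the sensor reports the true state with probability 4/5.
trans = lambda xn, x, u: Fraction(3, 4) if xn == (x + u) % 2 else Fraction(1, 4)
sensor = lambda y, x: Fraction(4, 5) if y == x else Fraction(1, 5)

belief = {0: Fraction(1), 1: Fraction(0)}
predicted = predict(belief, 1, trans)
posterior = step(belief, 1, 1, trans, sensor)
```

As in the nondeterministic case, only the current distribution needs to be
stored between stages.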

For the probabilistic case, the derived information space, $\mathcal{I}^\circ$,
is the set, $\mathcal{P}(X)$, of all probability distributions over $X$. Again,
the planning problem can be expressed entirely in terms of the derived
information space, instead of maintaining histories. A goal region can be
specified as constraints on the probabilities. For example, for some particular
$x \in X$, the goal might be to reach any derived information state for which
$P(x \mid \eta_k) > 1/2$.




Figure 11.3: The probabilistic derived information space for the three-state
example is a 2-simplex embedded in $\mathbb{R}^3$.



Example 11.2.2 (Three-State Example Revisited) Now return to Example 11.2.1, but
this time use probabilistic models. For a derived information state, let $p_i$
denote the probability that the current state is $i \in X$. The derived
information state can be expressed as $(p_0, p_1, p_2) \in \mathbb{R}^3$. This
implies that the information space can be nicely embedded in $\mathbb{R}^3$. By
the axioms of probability, $p_0 + p_1 + p_2 = 1$, which in $\mathbb{R}^3$ can be
interpreted as a plane that cuts diagonally across the nonnegative octant. This
restricts $\mathcal{I}^\circ$ to a two-dimensional set. Also following the axioms
of probability, $0 \leq p_i \leq 1$ for each $i \in \{0, 1, 2\}$. This means that
$\mathcal{I}^\circ$ is restricted to a triangular region in $\mathbb{R}^3$. The
vertices of this triangular region are $(0,0,1)$, $(0,1,0)$, and $(1,0,0)$; these
correspond to the three different ways to have perfect state information. In a
sense, the distance away from these points corresponds to the amount of
uncertainty in the state. The uniform probability distribution $(1/3, 1/3, 1/3)$
is equidistant from the three vertices. A projection of the triangular region
into $\mathbb{R}^2$ is shown in Figure 11.3. The interpretation in this case is
that $p_1$ and $p_2$ give a point in $\mathbb{R}^2$, and $p_0$ is automatically
determined from $p_0 = 1 - p_1 - p_2$.

The triangular region in $\mathbb{R}^3$ corresponds to an uncountably infinite
set, even though the original information space is countably infinite for a
fixed initial condition. This may seem strange, but there is no problem because
for a fixed initial condition, it is generally impossible to reach all of the
points in $\mathcal{P}(X)$. If the initial condition allows any point in
$\mathcal{P}(X)$, then all of the derived information space is covered.

■
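
As a hedged illustration of how the computations might proceed, suppose (these
probabilities are assumptions, since the example does not specify them) that
nature chooses $\theta$ uniformly from $\{0, 1\}$ and $\psi$ uniformly from
$\{0, 1, 2\}$, and that the prior $P(x_1)$ is uniform over $\{0, 2\}$. Writing
each distribution as a triple $(p_0, p_1, p_2)$, and reusing $y_1 = 2$,
$u_1 = 1$, and $y_2 = 3$ from Example 11.2.1, the first two stages work out as
follows:

```latex
\begin{align*}
P(x_1) &= \left(\tfrac12,\, 0,\, \tfrac12\right), \qquad
  P(y_1 = 2 \mid x_1) = \tfrac13 \ \text{for every } x_1 \in X, \\
P(x_1 \mid \eta_1) &= \left(\tfrac12,\, 0,\, \tfrac12\right)
  \quad \text{by (11.32), since the observation is uninformative}, \\
P(x_2 \mid \eta_1, u_1) &= \left(\tfrac14,\, \tfrac12,\, \tfrac14\right)
  \quad \text{by (11.34)}, \\
P(y_2 = 3 \mid x_2) &= \left(0,\, \tfrac13,\, \tfrac13\right), \\
P(x_2 \mid \eta_2) &= \frac{\left(0,\, \tfrac16,\, \tfrac1{12}\right)}{\tfrac14}
  = \left(0,\, \tfrac23,\, \tfrac13\right)
  \quad \text{by (11.35)}.
\end{align*}
```

The support of the final distribution is $\{1, 2\}$, which agrees with the
nondeterministic derived information state obtained in (11.29).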



11.2.3 Collapsing the Information Space 

The mappings from $\mathcal{I}$ to $\mathcal{I}^\circ$, which were presented in
Sections 11.2.1 and 11.2.2, are special cases of a very general and powerful
principle called collapsing. For a given problem, there are numerous possible
mappings that can be developed to further reduce the size of the information
space. The general idea is to map the original information space to a smaller
space, but to ensure that whenever a successful plan exists over the original
space, one will also exist over the smaller space. This idea will now be
formalized.

Let $\Phi : \mathcal{I} \to \mathcal{I}^c$ denote a surjective (onto) mapping
from an information space, $\mathcal{I}$, to a collapsed information space,
$\mathcal{I}^c$. Usually, $\mathcal{I}^c$ is selected to be as small as possible
while ensuring that satisfactory plans still exist. To make this precise, some
definitions are needed to relate the set of possible plans over $\mathcal{I}$ to
the plans over $\mathcal{I}^c$, which is generally considered to be smaller. For
a given information space, $\mathcal{I}$, let $\Pi(\mathcal{I})$ denote the set
of all possible plans, $\pi : \mathcal{I} \to U$. This notation can also be
applied to derived and collapsed information spaces, to yield
$\Pi(\mathcal{I}^\circ)$ and $\Pi(\mathcal{I}^c)$, respectively.

One must be very careful in designing $\Phi$ because it may potentially destroy
possible solutions. In the worst case, $\Phi$ can map $\mathcal{I}$ to a set
$\mathcal{I}^c$ that contains only one state. Clearly this is a bad idea. Let
$\mathcal{I}^c = \{\eta\}$. The set of all plans of the form
$\mathcal{I}^c \to U$ is dramatically reduced. There are only $|U|$ possible
plans, each of which applies a fixed $u \in U$ over all stages. The set,
$\Pi(\mathcal{I}^c)$, of all plans over $\mathcal{I}^c$ can generally be
considered as a subset of $\Pi(\mathcal{I})$, defined as

$$\Pi_\Phi(\mathcal{I}) = \{\pi \in \Pi(\mathcal{I}) \mid \exists \pi' \in \Pi(\mathcal{I}^c) \text{ such that } \forall \eta \in \mathcal{I},\ \pi(\eta) = \pi'(\Phi(\eta))\}. \tag{11.37}$$

In words, this means that any plan in $\Pi_\Phi(\mathcal{I})$ can be represented
as a plan in $\Pi(\mathcal{I}^c)$. In general, there will be many information
states in $\mathcal{I}$ that map to a single information state in
$\mathcal{I}^c$. For these information states, the action, $\pi(\eta)$, must
remain fixed.

A useful way to interpret $\Pi_\Phi(\mathcal{I})$ is obtained by considering the
partition of $\mathcal{I}$ that is induced by $\Phi$ as follows. For each
$\eta \in \mathcal{I}^c$, let $\Phi^{-1}(\eta)$ denote the set of
$\eta' \in \mathcal{I}$ for which $\Phi(\eta') = \eta$. By constructing this set
for each $\eta \in \mathcal{I}^c$, a partition of $\mathcal{I}$ is formed. For a
plan to lie in $\Pi_\Phi(\mathcal{I})$, it must hold a constant value over each
set in the partition. In the extreme case in which $\mathcal{I}^c$ contains only
one element, the partition contains one set, $\mathcal{I}$ itself. In the other
extreme, $\Phi$ is the identity mapping; in this case, the partition contains
only singleton elements, one for each $\eta \in \mathcal{I}$. Intuitively, the
partition represents the "resolution" over which a plan can be defined. The two
given extremes represent the lowest and highest possible resolutions. Once
$\Phi$ is selected, every plan must be adapted to $\mathcal{I}^c$ by keeping a
fixed value over each set in the induced partition.
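
The relationship between a plan over the collapsed space and the plan it induces
over the original space can be sketched as follows. Everything concrete here is
an assumption chosen for illustration: information states are stored as triples
(initial condition, actions, observations), and the collapsing map simply keeps
the most recent observation.

```python
def lift_plan(plan_c, Phi):
    """Turn a plan over the collapsed space into a plan over the original space."""
    return lambda eta: plan_c(Phi(eta))

def induced_partition(info_states, Phi):
    """Group information states by their image under Phi."""
    cells = {}
    for eta in info_states:
        cells.setdefault(Phi(eta), []).append(eta)
    return cells

# Hypothetical collapsing map: remember only the latest observation.
Phi = lambda eta: eta[2][-1]

# A hypothetical plan over the collapsed space (terminate once y = 4 is seen).
plan_c = lambda y: "T" if y == 4 else 1
plan = lift_plan(plan_c, Phi)

X1 = frozenset({0, 2})
info_states = [
    (X1, (), (2,)),
    (X1, (1,), (2, 3)),
    (X1, (0,), (1, 3)),
    (X1, (1, 1), (2, 3, 4)),
]
cells = induced_partition(info_states, Phi)
```

Two histories that end with the same observation fall into the same cell, so the
lifted plan cannot distinguish them and must apply the same action to both.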

The main concern when selecting $\Phi$ is that the restriction to
$\Pi_\Phi(\mathcal{I})$ does not severely limit the quality of solutions that
can be produced. In the context of feasible planning, one must ensure that
$\Phi$ does not destroy feasibility. If a feasible plan exists in
$\Pi(\mathcal{I})$, then one must also exist in $\Pi_\Phi(\mathcal{I})$. This
condition might be required to hold for all initial conditions,
$\eta_0 \in \mathcal{I}_0$, or maybe $\Phi$ is designed to preserve feasibility
for a particular initial condition. For problems that involve optimality, we may
require that among the set of optimal plans in $\Pi(\mathcal{I})$, at least one
must lie in $\Pi_\Phi(\mathcal{I})$. This requirement could also be weakened by
requiring that only an approximately optimal plan exist in
$\Pi_\Phi(\mathcal{I})$.

It is important to be aware that a plan in $\Pi_\Phi(\mathcal{I})$ might not behave the same way
as the corresponding plan in $\Pi(\mathcal{I}_c)$. This can be seen by recalling the information
transition equation, $f_\mathcal{I}$, from (11.20) and (11.21). Once $\Phi$ is applied, the
plan causes transitions to occur over the collapsed information space. Suppose for
illustration purposes that there are no nature actions, which yields (11.20). Let
$\eta_k \in \mathcal{I}$ denote an information state in the original information space. According
to (11.20), applying some $u_k$ yields $\eta_{k+1} = f_\mathcal{I}(\eta_k, u_k)$ on the original
information space. Let $\eta'_k \in \mathcal{I}_c$ denote the collapsed information state for which
$\eta'_k = \Phi(\eta_k)$. If the same action, $u_k$, is applied using the collapsed information
space, then $\eta'_{k+1} = f_\mathcal{I}(\eta'_k, u_k)$ is obtained. The problem is that
$\eta'_{k+1}$ might not be the same as $\Phi(\eta_{k+1})$. Algebraically, this means that
$f_\mathcal{I}$ and $\Phi$ generally do not commute. Applying $f_\mathcal{I}$ and then $\Phi$ to an
information state $\eta_k$ is not necessarily equivalent to applying $\Phi$ and then
$f_\mathcal{I}$. This problem will be illustrated in Example 11.3.3.

The derived information states represent ideal examples of collapsing the
information space. Using $\mathcal{I}^\circ$ instead of $\mathcal{I}$ preserves feasibility and
optimality for virtually any planning problem. For particular problems, however, it may be
possible to obtain much smaller information spaces that also preserve these properties.
An interesting example of this is given in Section ??.

11.2.4 Limited Memory Models 

One general way to reduce the size of the information states is to limit the amount 
of memory. Except in special cases, this usually does not preserve the feasibility 
or optimality of the original problem. Nevertheless, such models are very useful 
in practice when there appears to be no other way to reduce the size of the 
information space. Furthermore, these models occasionally do preserve the desired 
properties of feasibility, and maybe also optimality. 

Previous $i$ stages Under this model, the history is truncated. Any actions or
observations received earlier than $i$ stages ago are dropped from memory. This
yields an information state defined as

    $\eta_k = (u_{k-i}, \ldots, u_{k-1}, y_{k-i+1}, \ldots, y_k)$,    (11.38)

in which it is assumed that $i > 0$ and $k > i$. If $k \le i$, then the information state is
defined in the usual way, given by (11.13). In general, the action and observation histories
could be truncated at different stages. The advantage of this approach, if it leads
to a solution, is that the length of the information state no longer grows with the
number of stages. If $U$ and $Y$ are finite, then the information space will also
be finite, even without using derived information states.
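A bounded memory of this kind can be sketched with two fixed-length queues. The class name `TruncatedHistory` and the string labels for actions and observations are hypothetical; the sketch assumes that each update supplies the action just applied together with the observation that follows it.

```python
from collections import deque

class TruncatedHistory:
    """Information state that remembers only the last i actions and observations."""

    def __init__(self, i):
        self.actions = deque(maxlen=i)       # oldest entries fall off automatically
        self.observations = deque(maxlen=i)

    def update(self, u, y):
        """Record the action just applied and the observation that followed it."""
        self.actions.append(u)
        self.observations.append(y)

    def state(self):
        """Return (u_{k-i}, ..., u_{k-1}, y_{k-i+1}, ..., y_k) as a tuple."""
        return tuple(self.actions) + tuple(self.observations)

mem = TruncatedHistory(i=2)
for k in range(1, 6):                 # apply u1..u5, receive y2..y6
    mem.update(u=f"u{k}", y=f"y{k + 1}")
print(mem.state())                    # ('u4', 'u5', 'y5', 'y6')
```

The tuple has length $2i$ no matter how many stages have elapsed, which is the point of the truncation.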

Sensor feedback An interesting case is obtained by removing all but the last
sensor observation from the information state. This yields $\eta_k = y_k$, which is
referred to as sensor feedback. In this case, all decisions are made directly in
terms of the sensor reading. A plan, $\pi$, can therefore be considered as a mapping
$\pi : Y \to U$. In some contexts, this may be referred to as a purely reactive
plan. There are generally many problems which have solutions when information
spaces are used, but for which no solution exists that uses only sensor feedback. However,
it may be worth determining whether such a solution exists. Such solutions tend
to be simpler to implement in practice. Certainly, if a sensor-feedback solution exists for
a problem, and feasibility is the only concern, then it is pointless to design and
implement a plan in terms of the entire information space.
EXAMPLE? 

11.3 Examples for Discrete State Spaces 

11.3.1 Basic Nondeterministic Examples 

First, consider a simple example that uses the sign sensor of Example 11.1.3.
Example 11.3.1 (Using the Sign Sensor) Let $X = \mathbb{Z}$, $U = \{-1, 1, u_T\}$,
$Y = \{-1, 0, 1\}$, and $y = h(x) = \operatorname{sgn} x$. For the state transition equation,
$x_{k+1} = f(x_k, u_k) = x_k + u_k$. There are no nature actions that interfere with the state
transition equation or the sensor mapping. Therefore, future information states
are predictable. The information transition equation, $f_\mathcal{I}$, is
$\eta_{k+1} = f_\mathcal{I}(\eta_k, u_k)$. Suppose that initially, $\eta_0 = X$, which means that any
initial state is possible. The goal is to reach and terminate at $0 \in X$.

An information state at stage $k$ appears as:

    $\eta_k = (X, u_1, \ldots, u_{k-1}, y_1, \ldots, y_k)$.    (11.39)

A typical value appears as $\eta_5 = (X, -1, 1, 1, -1, 1, 1, 1, 1, 0)$. Using the derived
information space, $\mathcal{I}^\circ$, from Section 11.2.1, $\mathcal{I}^\circ = \operatorname{pow}(X)$,
which is uncountably infinite. By looking carefully at the problem, however, it can be seen that
most of the derived information states are not reachable. If $y_k = 0$, it is known that
$x_k = 0$; hence, $\eta_k = \{0\}$. If $y_k = 1$, it will always be the case that
$\eta_k = \{1, 2, \ldots\}$. If $y_k = -1$, then $\eta_k = \{\ldots, -2, -1\}$. From this, a plan,
$\pi$, can be specified over these three derived information states. For the first one,
$\pi(\eta_k) = u_T$. For the other two, $\pi(\eta_k) = -1$ and $\pi(\eta_k) = 1$, respectively.
Based on the sign, the plan tries to move towards 0. If different initial conditions are allowed,
then more derived information states can be reached, but this was not required as the problem was
defined. Note that optimal-length solutions are produced by the plan. ■
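The plan of Example 11.3.1 can be simulated directly. The following is a minimal sketch; the function names, the string used to stand for $u_T$, and the stage limit are assumptions introduced only for the simulation.

```python
def sgn(x):
    """Sign sensor: returns -1, 0, or 1."""
    return (x > 0) - (x < 0)

U_T = "u_T"  # stand-in for the termination action

def plan(y):
    """Plan over the three reachable derived information states, indexed by y."""
    return {0: U_T, 1: -1, -1: 1}[y]

def execute(x0, max_stages=10_000):
    """Run the plan from initial state x0; return (final state, number of moves)."""
    x, moves = x0, 0
    for _ in range(max_stages):
        u = plan(sgn(x))
        if u == U_T:
            return x, moves
        x += u
        moves += 1
    raise RuntimeError("plan did not terminate")

for x0 in (-7, 0, 12):
    print(x0, execute(x0))   # final state is 0 after |x0| moves
```

The number of moves equals $|x_0|$, which matches the remark that the plan produces optimal-length solutions.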

The next example provides a simple illustration of solving a problem without 
ever knowing the exact state. This leads to the goal recognizability problem [?]. 

Example 11.3.2 (Goal Recognizability) Let $X = \mathbb{Z}$, $U = \{-1, 1, u_T\}$, and
$Y = \mathbb{Z}$. For the state transition equation, $x_{k+1} = f(x_k, u_k) = x_k + u_k$. Now
suppose that for sensing, a variant of Example 11.1.7 with a sensor disturbance is used,
$y = h(x, \psi)$, with $\Psi = \{-5, \ldots, 5\}$. Suppose that once again, $\eta_0 = X$. In this
case, it is impossible to guarantee that a goal, $X_G = \{0\}$, is reached because of the goal
recognizability problem. The disturbance in the sensor mapping does not allow
precise enough state measurements to deduce the precise goal state. If the goal
region, $X_G$, is enlarged to $\{-5, \ldots, 5\}$, then the problem can be solved. Due to the
disturbance, the derived information state will always be a subset of a consecutive
sequence of 11 states. It is simple to derive a plan that moves this interval until
the derived information state becomes a subset of $X_G$. When this occurs, the
plan applies $u_T$. In solving this problem, the exact state never had to be known. ■
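The interval-moving plan of Example 11.3.2 can be sketched as follows. The example does not spell out $h$, so the sketch assumes an additive disturbance, $y = x + \psi$, and an enlarged goal region consisting of the integers from -5 to 5; the function names and the use of a seeded random generator to play the role of nature are also assumptions.

```python
import random

U_T = "u_T"      # stand-in for the termination action
GOAL = (-5, 5)   # goal region as an inclusive interval (assumed)
R = 5            # disturbance bound: psi ranges over -R..R

def plan(lo, hi):
    """Shift the interval [lo, hi] toward the goal; terminate once it is inside."""
    if GOAL[0] <= lo and hi <= GOAL[1]:
        return U_T
    return -1 if hi > GOAL[1] else 1

def execute(x, rng, max_stages=100_000):
    """Simulate from true state x, which the plan never sees; return the final state."""
    y = x + rng.randint(-R, R)
    lo, hi = y - R, y + R                 # first derived information state
    for _ in range(max_stages):
        u = plan(lo, hi)
        if u == U_T:
            return x
        x += u                            # true state transition
        y = x + rng.randint(-R, R)        # new disturbed observation
        # Shift the old interval by u, then keep only states consistent with y.
        lo, hi = max(lo + u, y - R), min(hi + u, y + R)
    raise RuntimeError("plan did not terminate")

rng = random.Random(0)
finals = [execute(x0, rng) for x0 in range(-40, 41, 8)]
print(finals)
```

Because the derived information state always contains the true state, termination inside the goal interval guarantees that the true state lies in the goal region, even though the plan only ever reasons about `lo` and `hi`.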

The problem shown in Figure 11.4 will serve two purposes. First, it is an
example of sensorless planning, which means that there are no observations. This
is an interesting class of problems because it appears that no information can
be gained regarding the state. Counterintuitively, it turns out for this example
and many others that plans can be designed that estimate the state. The
second purpose is to illustrate how the information space can be dramatically
collapsed using the concepts of Section 11.2.3. The derived information space for
this example initially contains $2^{19}$ states, but it can be nicely collapsed to a small
number of states for planning purposes.

Figure 11.4: An example that involves 19 states. There are no sensor observations;
however, actions can be chosen that enable the state to be estimated. The
example provides an illustration of collapsing the information space.



Example 11.3.3 (Moving in an L-shaped Corridor) The state space, $X$,
for the example shown in Figure 11.4 has 19 states, each of which corresponds to a
location on one of the white tiles. For convenience, let each state be denoted by
$(i, j)$. There are 10 bottom states, denoted by $(1,1), (2,1), \ldots, (10,1)$, and 10 left
states, denoted by $(1,1), (1,2), \ldots, (1,10)$. Since $(1,1)$ is both a bottom state
and a left state, it will be called the corner state.

It is assumed for this problem that there are no sensor observations. Nature,
however, interferes with the state transitions, which leads to a form of
nondeterministic uncertainty. If we try to apply an action that takes one step, nature may
cause two or three steps to be taken, if possible. This can be modeled as follows.
Let

    $U = \{(1,0), (-1,0), (0,1), (0,-1)\}$    (11.40)

and let $\Theta = \{1, 2, 3\}$. The state transition equation is defined as
$f(x, u, \theta) = x + \theta u$, unless it is impossible to move to the required location. For
example, if $x = (5,1)$, $u = (-1,0)$, and $\theta = 2$, then the resulting next state is
$(5,1) + 2(-1,0) = (3,1)$. If it is not possible to move all the way to the state $x + \theta u$,
then the state advances as many steps as possible in the direction $u$ and stops at the last tile.

Because there are no sensor observations, the information state at stage $k$ is

    $\eta_k = (u_1, \ldots, u_{k-1})$.    (11.41)

Now use the derived information space, $\mathcal{I}^\circ = \operatorname{pow}(X)$. The initial state,
$x_1 = (10,1)$, is given, which means that the initial information state, $\eta_1$, is $\{(10,1)\}$.
The goal is to arrive at the information state $\{(1,10)\}$. This means that the task
is to design a plan that moves from the lower right to the upper left.

With perfect information, this would be trivial; however, without sensors
the uncertainty may grow very quickly. For example, after applying the
action $u_1 = (-1,0)$ from the initial state, the derived information state becomes
$\{(7,1), (8,1), (9,1)\}$. After $u_2 = (-1,0)$ it becomes $\{(4,1), \ldots, (8,1)\}$. A nice
feature of this problem, however, is that uncertainty can be reduced without
sensing. Suppose that for 100 stages, we continue to apply $u_k = (-1,0)$. What is the
resulting information state? As the corner state is approached, the uncertainty is
reduced because the state cannot be further changed by nature. It is known that
each action, $u_k = (-1,0)$, decreases the first coordinate by at least one each time
until the corner is reached. Therefore, after 9 or more stages, it is known that
$\eta_k = \{(1,1)\}$. Once this is known, the action $(0,1)$ can be applied. This will again
increase uncertainty as the state moves through the set of left states. If $(0,1)$ is applied
9 or more times, then it is known for certain that $x_k = (1,10)$, which is the required goal
state.

A successful plan has now been obtained: 1) apply $(-1,0)$ for 9 stages, 2) then
apply $(0,1)$ for 9 stages. Recall from Section 11.1.3 that a plan is generally
specified as $\pi : \mathcal{I} \to U$; however, for this example, it appears that only a sequence
of actions is needed. The actions do not depend on the information state. Why
did this happen? If no observations are obtained during execution, then there
is no way to use feedback. There is nothing to learn by executing the plan. In
general, for problems that involve no sensors and a fixed initial information state,
a path in the information space can be derived from a plan. It is somewhat strange
that this path is completely predictable, even though the original problem may
involve substantial uncertainties. We always know precisely what will happen in
terms of the information states.
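The predictable path through the derived information space can be computed by brute force. This is a sketch under one stated assumption: a move that would run past the end of the corridor is truncated at the last tile, which is the rule under which each action advances at least one tile. The names `f`, `F`, and `path` are introduced here for illustration.

```python
# Tiles of the L-shaped corridor: bottom row (i, 1) and left column (1, j).
X = {(i, 1) for i in range(1, 11)} | {(1, j) for j in range(1, 11)}
THETA = (1, 2, 3)   # nature chooses how many steps are taken

def f(x, u, theta):
    """Take up to theta steps in direction u, stopping at the last tile (assumed rule)."""
    for _ in range(theta):
        nxt = (x[0] + u[0], x[1] + u[1])
        if nxt not in X:
            break
        x = nxt
    return x

def F(eta, u):
    """Next derived information state: every state that nature could produce."""
    return frozenset(f(x, u, th) for x in eta for th in THETA)

plan = [(-1, 0)] * 9 + [(0, 1)] * 9
eta = frozenset({(10, 1)})
path = [eta]
for u in plan:
    eta = F(eta, u)
    path.append(eta)

print(sorted(path[1]))    # [(7, 1), (8, 1), (9, 1)]
print(sorted(path[9]))    # [(1, 1)]
print(sorted(path[18]))   # [(1, 10)]
```

No randomness appears anywhere in the computation: the uncertainty lives inside each set, while the sequence of sets is fixed by the plan.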

To make the situation more interesting, assume that any subset of $X$ could be
used as the initial condition. In this case, a plan $\pi : \mathcal{I}^\circ \to U$ must be
formulated to solve the problem. From each initial information state $\eta$, a path in
$\mathcal{I}^\circ$ can still be computed from $\pi$. Specifying a plan over all of
$\mathcal{I}^\circ$ appears quite complicated, which motivates the next consideration.

The ideas from Section 11.2.3 can be applied here to collapse the information
space down from $2^{19}$ (over half a million) states to only 19. The mapping
$\Phi : \mathcal{I} \to \mathcal{I}_c$ must be constructed. We have already mapped $\mathcal{I}$ to
$\mathcal{I}^\circ$ by using derived information states; therefore, the collapsed information
space will be obtained by defining $\Phi : \mathcal{I}^\circ \to \mathcal{I}_c$. We first make a
naive attempt to collapse the information space down to only three states. This illustrates the
issue mentioned near the end of Section 11.2.3. Let $\mathcal{I}_c = \{g, l, a\}$, in which $g$
denotes "goal", $l$ denotes "left", and $a$ denotes "any". The mapping is

    $\Phi(\eta) = \begin{cases} g & \text{if } \eta = \{(1,10)\} \\ l & \text{if } \eta \ne \{(1,10)\} \text{ and } \eta \subseteq \{(1,1), \ldots, (1,10)\} \\ a & \text{otherwise.} \end{cases}$    (11.42)

It might seem that this collapsed information space will lead to a very compact
plan for solving the problem. Based on the successful plan described so far, the
plan on $\mathcal{I}_c$ can be defined as $\pi(g) = u_T$, $\pi(l) = (0,1)$, and $\pi(a) = (-1,0)$.
What is wrong with this? Suppose that the initial state is $(10,1)$. There is no way to
require that $u = (-1,0)$ is applied 9 times to reach the $l$ state. If $(-1,0)$ is
applied to the $a$ state, then it is not possible to determine when the transition to
$l$ will occur.
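The failure of the three-state collapse can be checked numerically. The sketch below assumes that the mapping sends $\{(1,10)\}$ to $g$, any other subset of the left states to $l$, and everything else to $a$, and that blocked moves are truncated at the last tile; `phi3`, `f`, and `F` are names introduced here.

```python
X = {(i, 1) for i in range(1, 11)} | {(1, j) for j in range(1, 11)}
LEFT = frozenset((1, j) for j in range(1, 11))
GOAL = frozenset({(1, 10)})

def f(x, u, theta):
    """Take up to theta steps in direction u, stopping at the last tile (assumed rule)."""
    for _ in range(theta):
        nxt = (x[0] + u[0], x[1] + u[1])
        if nxt not in X:
            break
        x = nxt
    return x

def F(eta, u):
    """Next derived information state under action u."""
    return frozenset(f(x, u, th) for x in eta for th in (1, 2, 3))

def phi3(eta):
    """Assumed three-state collapse: goal, subset of the left states, or any."""
    if eta == GOAL:
        return "g"
    return "l" if eta <= LEFT else "a"

u = (-1, 0)
eta1, eta2 = frozenset({(2, 1)}), frozenset({(10, 1)})
print(phi3(eta1), phi3(eta2))               # a a : same collapsed state
print(phi3(F(eta1, u)), phi3(F(eta2, u)))   # l a : different successors
```

Two derived information states with the same image under the mapping have successors with different images, so no transition function on the three collapsed states can agree with both. This is a concrete instance of the information transition equation and the collapse mapping failing to commute.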

Now consider a different collapsed information space. Suppose that there are 19
collapsed information states, which include $g$ as defined previously, $l_i$ for
$1 \le i \le 9$, and $a_i$ for $2 \le i \le 10$. The mapping $\Phi$ is defined as
$\Phi(\eta) = g$ if $\eta = \{(1,10)\}$. Otherwise, $\Phi(\eta) = l_i$, for the largest value of
$i$ such that $\eta$ is a subset of $\{(1,i), \ldots, (1,10)\}$. If there is no such value for
$i$, then $\Phi(\eta) = a_i$, for the smallest value of $i$ such that $\eta$ is a subset of
$\{(1,1), \ldots, (1,10), (2,1), \ldots, (i,1)\}$. Now the plan may be defined as
$\pi(g) = u_T$, $\pi(l_i) = (0,1)$, and $\pi(a_i) = (-1,0)$. Although
it might not appear to be any better than the plan obtained from collapsing
$\mathcal{I}^\circ$ to three states, the important difference is that the correct information state
transitions occur. For example, if $u_k = (-1,0)$ is applied at $a_5$, then $a_4$ is obtained.
If $u = (-1,0)$ is applied at $a_2$, then $l_1$ is obtained. From there, $u = (0,1)$ is applied
to yield $l_2$. These actions can be repeated until eventually $l_9$ and $g$ are reached.
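The 19-state collapse can be traced along the execution of the plan. The sketch assumes the mapping with $g$, $l_1, \ldots, l_9$, and $a_2, \ldots, a_{10}$ (a label such as $a_5$ is encoded as the pair `("a", 5)`), and assumes that blocked moves are truncated at the last tile; all function names are introduced here.

```python
X = {(i, 1) for i in range(1, 11)} | {(1, j) for j in range(1, 11)}
U_T = "u_T"   # stand-in for the termination action

def f(x, u, theta):
    """Take up to theta steps in direction u, stopping at the last tile (assumed rule)."""
    for _ in range(theta):
        nxt = (x[0] + u[0], x[1] + u[1])
        if nxt not in X:
            break
        x = nxt
    return x

def F(eta, u):
    """Next derived information state under action u."""
    return frozenset(f(x, u, th) for x in eta for th in (1, 2, 3))

def left_from(i):
    """The set {(1, i), ..., (1, 10)}."""
    return {(1, j) for j in range(i, 11)}

def any_up_to(i):
    """All left states together with (2, 1), ..., (i, 1)."""
    return left_from(1) | {(k, 1) for k in range(2, i + 1)}

def phi(eta):
    """Collapse a derived information state to g, l_i, or a_i."""
    if eta == {(1, 10)}:
        return ("g", None)
    ls = [i for i in range(1, 10) if eta <= left_from(i)]
    if ls:
        return ("l", max(ls))
    return ("a", min(i for i in range(2, 11) if eta <= any_up_to(i)))

def pi(c):
    """Plan over the collapsed information space."""
    return {"g": U_T, "l": (0, 1), "a": (-1, 0)}[c[0]]

eta = frozenset({(10, 1)})
labels = [phi(eta)]
while pi(labels[-1]) != U_T:
    eta = F(eta, pi(labels[-1]))
    labels.append(phi(eta))
print(labels)
```

Starting from $\{(10,1)\}$, the labels step through $a_{10}, a_9, \ldots, a_2$, then $l_1, \ldots, l_9$, and finally $g$, so the index changes by exactly one per stage. That regularity is what the three-state collapse lacks.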



11.3.2 Nondeterministic Finite Automata 

An interesting connection lies between the ideas of this chapter and the theory of
finite automata, which is part of the theory of computation (see [339, 711]). In
Section ??, it was mentioned that determining whether there exists some string that
is accepted by a deterministic finite automaton (DFA) is equivalent to a discrete
feasible planning problem. If unpredictability is introduced into the model, then
a nondeterministic finite automaton (NFA) is obtained, as depicted in Figure
11.5. This represents one of the simplest examples of nondeterminism in
theoretical computer science. Such nondeterministic models in general serve as a powerful
tool for defining models of computation and their associated complexity classes.
It turns out that these models give rise to interesting examples of information
spaces.

Figure 11.5: a) An NFA is a state machine that reads an input string and decides
whether or not to accept it. b) A graphical depiction of a nondeterministic finite
automaton (NFA).

A nondeterministic finite automaton (NFA) is typically described using a
directed graph as shown in Figure ??.b, and is considered as a special kind of finite
state machine. Each vertex of the graph represents a state, and edges represent
possible transitions. An input string of finite length is read by the machine. For
the example, the input string is assumed to be a binary sequence that consists of
0s and 1s. The initial state is designated by an inward arrow that has no source
vertex, as shown pointing into state $a$ in Figure 11.5b. The machine starts in this
state and reads the first symbol of the input string. Based on its value, it makes
appropriate transitions. For a deterministic finite automaton (DFA), the next
state must be specified for each of the two inputs, 0 and 1, from each state. From
a state in an NFA, there may be any number of outgoing edges (including none)
that represent the response to a single input. For example, there are two outgoing
edges if 0 is read from state $c$ (the arrow from $c$ to $b$ actually corresponds to two
directed edges, one for 0 and the other for 1). There are also edges designated
with a special $\epsilon$ symbol. If a state has an outgoing $\epsilon$ edge, the state may
immediately transition along the edge without reading another symbol. This may be iterated
any number of times, for any outgoing $\epsilon$ edges that may be encountered, without
reading the next input symbol. The nondeterminism arises from the fact that
there are multiple choices for possible next states due to multiple edges for the
same input and $\epsilon$ transitions. There is no sensor that indicates which state is
actually chosen. The interpretation in the theory of computation is that when there
are multiple choices, the machine clones itself, and one copy runs each choice. It
is like having multiple universes in which each different possible action of nature is
occurring simultaneously. If there are no outgoing edges for a certain combination
of state and input, then the clone dies. Any states that have a double boundary,
such as state $a$ in Figure 11.5, indicate accept states. When the input string ends,
the NFA is said to accept the input string if there exists at least one alternate
universe in which the final machine state is an accept state.

The formulation usually given for NFAs seems very close to Formulation ??,
for discrete feasible planning. Here is a typical NFA formulation [711], which
formalizes the ideas depicted in Figure 11.5:

Formulation 11.3.1 (Nondeterministic Finite Automaton)

1. A finite state space, $X$.

2. A finite alphabet, $\Sigma$, which represents the possible input symbols. Let
$\Sigma_\epsilon = \Sigma \cup \{\epsilon\}$.

3. A transition function, $\delta : X \times \Sigma_\epsilon \to \operatorname{pow}(X)$. For each state
and symbol, a set of outgoing edges is specified by indicating the states that are reached.

4. A start state, $x_0 \in X$.

5. A set, $A \subseteq X$, of accept states.

Example 11.3.4 (Three-State NFA) The example in Figure 11.5 can be
expressed using Formulation 11.3.1. The components are $X = \{a, b, c\}$, $\Sigma = \{0, 1\}$,
$\Sigma_\epsilon = \{0, 1, \epsilon\}$, $x_0 = a$, and $A = \{a\}$. The transition function requires the
specification of a set of states for every $x \in X$ and symbol in $\Sigma_\epsilon$:

(11.43)


Now consider reformulating the NFA and its acceptance of strings as a kind of
planning problem. An input string can be considered as a plan that uses no form
of feedback; it is a fixed sequence of actions. The planning problem is to determine
whether a string exists that is accepted by the NFA. Because there is no feedback,
there is no sensing model. The initial state is known, but subsequent states
cannot be measured. The history at stage $k$ reduces to
$\eta_k = \tilde{u}_{k-1} = (u_1, \ldots, u_{k-1})$, the sequence of actions that have been applied so
far. The nondeterminism can be accounted for by defining nature actions that interfere with state
transitions. This results in the following formulation, which is described in terms of Formulation
11.3.1:

Formulation 11.3.2 (An NFA Planning Problem)

1. A finite state space, $X$.

2. An action space, $U = \Sigma \cup \{u_T\}$.

3. A state transition function, $F : X \times U \to \operatorname{pow}(X)$. For each state and
symbol, a set of outgoing edges is specified by indicating the states that are
reached.

4. An initial state, $x_0 = x_1$.

5. A set, $X_G = A$, of goal states.


The information space, $\mathcal{I}$, is defined using

(11.44)

for each $k \in \mathbb{N}$, and taking the union as defined in (11.17). It is assumed that
the initial state of the NFA is always fixed; therefore, $\mathcal{I}_0$ does not appear in the
definition of $\mathcal{I}$. Because there is no feedback, a plan, $\pi$, is just a sequence of
actions, as defined for the problems in Chapter 2.

For expressing the planning task, it is best to use the derived information
space, $\mathcal{I}^\circ = \operatorname{pow}(X)$, from Section 11.2.1. Thus, each information state,
$\eta \in \mathcal{I}^\circ$, is a subset of $X$ that corresponds to the possible current states of the
machine. The initial condition could be any subset of $X$ because $\epsilon$ transitions can occur
from $x_1$. Subsequent derived information states follow directly from $F$. The task
is to compute a plan of the form

    $\pi = (u_1, u_2, \ldots, u_K, u_T, u_T, \ldots)$,    (11.45)

which results in an information state $\eta_{K+1} \in \mathcal{I}^\circ$ for which
$\eta_{K+1} \cap X_G \ne \emptyset$. This means that at least one possible state of the NFA must lie
in $X_G$ after the termination action is applied. This condition is much weaker than a typical
planning requirement. Using worst-case analysis, a typical requirement would be that every
possible NFA state lies in $X_G$.

The problem given in Formulation 11.3.2 is not precisely a specialization
of Formulation ?? because of the state transition function. For convenience, $F$
was directly defined, instead of explicitly requiring that $f$ is defined in terms of
nature actions, $\Theta(x, u)$, which in this context depend on both $x$ and $u$ for an
NFA. There is one other small issue regarding this formulation. In the planning
problems considered in this book, it is always assumed that there is a current
state. For an NFA, it was already mentioned that if there are no outgoing edges
for a certain input, then the clone of the machine dies. This means that a potential
current state ceases to exist. It is even possible that every clone dies, which leaves
no current state for the machine. This can be easily enabled by directly defining
$F$; however, planning problems must always have a current state. To resolve this
issue, we could augment $X$ in Formulation 11.3.2 to include an extra dead state,
which signifies the death of a clone when there are no outgoing edges. A dead
state can never lie in $X_G$, and once a transition to a dead state occurs, the state
remains dead for all time. In this section, the state space will not be augmented
in this way; however, it is important to note that the formulation can easily be
made consistent with Formulation 11.3.2.

The planning model can now be compared to the standard use of NFAs in the
theory of computation. The language of an NFA is defined to be the set of all input
strings that it accepts. The planning problem formulated here determines whether
there exists a string (which is a plan that ends with termination actions) that is
accepted by the NFA. Equivalently, a planning algorithm determines whether or
not the language of an NFA is empty. Constructing the set of all successful plans
is equivalent to determining the language of the NFA.

Example 11.3.5 (Planning for the Three-State NFA) The example in
Figure 11.5 can be expressed using Formulation 11.3.2. The components are
$X = \{a, b, c\}$, $\Sigma = \{0, 1\}$, $\Sigma_\epsilon = \{0, 1, \epsilon\}$, $x_0 = a$, and
$X_G = \{a\}$. The function $F(x, u)$ is defined as

The derived information space is

    $\mathcal{I}^\circ = \{\emptyset, \{a\}, \{b\}, \{c\}, \{a,b\}, \{a,c\}, \{b,c\}, \{a,b,c\}\}$    (11.47)

in which the initial condition is $\eta_0 = \{a, b\}$ because an $\epsilon$ transition occurs
immediately from $a$. An example plan that solves the problem is $(1, 0, 0, u_T, \ldots)$. This
corresponds to sending the input string 100 through the NFA depicted in Figure
11.5. The sequence of information states obtained during the execution of the
plan is

    $\{a,b\} \xrightarrow{1} \{c\} \xrightarrow{0} \{b,c\} \xrightarrow{0} \{a,b,c\} \xrightarrow{u_T} \{a,b,c\}$.    (11.48)



A basic theorem from finite automata states that for the set of strings accepted
by an NFA, there exists a DFA (deterministic) that accepts the same set [711].
This is proven by constructing a DFA directly from the derived information space.
Each derived information state can be considered as a state of a DFA. Thus, the
DFA has $2^n$ states, if the original NFA has $n$ states. The state transitions of
the DFA are derived directly from the transitions between derived information
states. When an input (or action) is given, a transition occurs from one
subset of $X$ to another. A corresponding transition is made between the two
corresponding states in the DFA. This construction is an interesting example of
how the information space is a new state space that arises when the states of the
original state space are unknown. Even though the information space is usually
larger than the original state space, its states are always known. Therefore, the
behavior appears the same as in the case of perfect state information. This idea
is very general, and may be applied to many problems beyond DFAs and NFAs.
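The construction of a DFA over the derived information space can be sketched in a few lines. The transition table `DELTA` below is a hypothetical three-state NFA: it is chosen to be consistent with the sequence of information states in (11.48), but entries that the text does not determine are simply left empty, so it should not be read as the table of Figure 11.5. The names `closure`, `F`, and `accepts` are also introduced here.

```python
from itertools import chain, combinations

EPS = "eps"
# Hypothetical transition table; any (state, symbol) pair not listed maps to the empty set.
DELTA = {
    ("a", "1"): {"c"}, ("a", EPS): {"b"},
    ("b", "0"): {"a"},
    ("c", "0"): {"b", "c"}, ("c", "1"): {"b"},
}
X, SIGMA, START, ACCEPT = ("a", "b", "c"), ("0", "1"), "a", {"a"}

def closure(states):
    """All states reachable from `states` using only epsilon edges."""
    seen = set(states)
    stack = list(seen)
    while stack:
        for nxt in DELTA.get((stack.pop(), EPS), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return frozenset(seen)

def F(eta, u):
    """Transition between derived information states (subsets of X)."""
    return closure(chain.from_iterable(DELTA.get((x, u), ()) for x in eta))

# One DFA state per subset of X, with a deterministic transition for each input.
subsets = [frozenset(s) for r in range(len(X) + 1) for s in combinations(X, r)]
dfa = {(eta, u): F(eta, u) for eta in subsets for u in SIGMA}

def accepts(string):
    """Run the DFA; accept if some possible NFA state is an accept state."""
    eta = closure({START})
    for u in string:
        eta = dfa[(eta, u)]
    return bool(eta & ACCEPT)

print(len(subsets), accepts("100"), accepts("1"))   # 8 True False
```

The dictionary `dfa` is an ordinary deterministic transition table: its states are subsets of $X$, and each one is always known exactly even though the underlying NFA state is not.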
11.3.3 Probabilistic Examples

POMDPs

Exploring an $(n-1)$-simplex embedded in $\mathbb{R}^n$.

Let the vertices be $(0, 0, \ldots, 0, 1)$, $(0, 0, \ldots, 0, 1, 0)$, $\ldots$, $(1, 0, \ldots, 0)$.

Each point in the simplex corresponds to a probability distribution over $X$.

It is specified by the barycentric coordinates.

A Sensor Planning Problem Can the actions control the sensor? We must 
allow the case of actions to determine where to sense. 

Need a good example of goal recognizability and the termination problem. 

11.4 Continuous State Spaces 

This section takes many of the concepts that have been developed in Sections
11.1 and 11.2, and generalizes them to continuous state spaces. This represents
an important generalization because the configuration space concepts, on which
motion planning was based in Part II, are all based on continuous state spaces.
In this section, the state space might be a configuration space, $X = \mathcal{C}$, as
defined in Chapter ??, or any other continuous state space. Because it may be a
configuration space, many interesting problems can be drawn from robotics.

During the presentation of the concepts of this section, it will be helpful to
recall analogous concepts that were already developed for discrete state spaces. In
many cases, the formulations appear identical. In others, the continuous case is
more complicated, but usually maintains some of the concepts from the discrete
case. It will be seen after introducing many continuous sensing models in Section
11.5 that many problems formulated in continuous spaces are even more elegant
and easy to understand than their discrete counterparts.

11.4.1 Discrete-Stage Information Spaces 

It is assumed here that there are discrete stages, $k$. Let $X \subseteq \mathbb{R}^m$ be an
$n$-dimensional manifold, for $n \le m$, called the state space.² Let $Y \subseteq \mathbb{R}^m$ be an
$n_y$-dimensional manifold, for $n_y \le m$, called the observation space. For each
$x \in X$, let $\Psi(x) \subseteq \mathbb{R}^m$ be an $n_\psi$-dimensional manifold, for $n_\psi \le m$,
called the set of nature observation actions. The three kinds of sensor mappings, $h$, defined in
Section 11.1.1 are possible, to yield either a state mapping, $y = h(x)$, a state-sensor
mapping, $y = h(x, \psi)$, or a history-based mapping, $y = h(x_1, \ldots, x_k, \psi)$. For the case
of a state mapping, the preimages, $H(y)$, once again induce a partition of $X$. Preimages
can also be defined for state-sensor mappings, but they do not necessarily induce
a partition of $X$.

² If you did not read Chapter 4, and are not familiar with manifold concepts, then assume
$X = \mathbb{R}^n$; it will not make much difference. Make similar assumptions for $Y$ and $\Psi$.

Many interesting sensing models can be formulated in continuous state spaces.
Section 11.5 provides a kind of sensor catalog. In general, there is once again
the choice of nondeterministic or probabilistic uncertainty if nature observation
actions are used. If nondeterministic uncertainty is used, there is nothing more to
define. Probabilistic models are defined in terms of a probability density function,
$p : \Psi \to [0, \infty)$.³

The information space definitions from Section 11.1.2 remain the same, with
the understanding that all of the variables are continuous. Thus, (??) and (??)
serve as the definitions of $\mathcal{I}_k$ and $\mathcal{I}$. Let $U \subseteq \mathbb{R}^m$ be an
$n_u$-dimensional manifold for $n_u \le m$. For each $x \in X$ and $u \in U$, let $\Theta(x, u)$ be an
$n_\theta$-dimensional manifold for $n_\theta \le m$. A discrete-stage information space planning
problem over continuous state spaces can be easily formulated by taking Formulation 11.1.1 and
replacing each discrete variable by its continuous counterpart that uses the same notation.
Therefore, the full formulation is not given.

11.4.2 Continuous-Time Information Spaces

Now assume that there is a continuum of stages. Most of the components of
Section 11.4.1 remain the same. The spaces, $X$, $Y$, $\Psi(x)$, $U$, $\Theta(x, u)$, remain the
same (REALLY???). The sensor mapping also remains the same. The main
difference occurs in the state transition equation. To specify it correctly in the most
general form, differential equations are necessary. To make the modeling problem
worse, expressing the effect of nature actions requires differential inequalities [] in
the case of nondeterministic uncertainty, and stochastic differential equations []
in the probabilistic case. Both of these concepts are generalizations of differential
equations that are well beyond the scope of this book. The ideas presented here
can be generalized to these cases, once the appropriate technical considerations
required for these advanced topics are resolved.

The approach taken here is to assume a specialized formulation to avoid these 
technical difficulties. 

Need to avoid actions. This means that plans directly specify the state. Does 
this even make sense for state feedback? It is like the path is specified in a 
coordinate frame, but the true frame is not known... 

Let $t$ denote time, $t \in T = [0, \infty)$.

Let $U \subseteq \mathbb{R}^m$ be the input space.

Let $u : [0, \infty) \to U$ be called the input history.

Let a state trajectory, $x : [0, \infty) \to X$, denote a solution to the following
system of $n$ differential equations,

    $\frac{dx}{dt} = f(x(t), u(t))$,

in which $f$ is a smooth mapping on $X$ and $U$.

³ We assume that all continuous spaces are measure spaces, and all $p$ functions are measurable
functions over these spaces.

Note: $u(t)$ could be derived from a state-feedback mapping $\gamma : X \to U$ to
obtain $f(x(t), \gamma(x(t)))$.

We will not be able to do this because of the next topic...
Sensor Model:

Let $Y \subseteq \mathbb{R}^k$ be the sensor space.

The sensor space models the set of possible instantaneous sensor readings.
For each $t \in [0, \infty)$, the sensor value $y(t)$ is given by

    $y(t) = h(x(t))$,

for some specified mapping $h : X \to Y$.

Often, $h$ is not injective, which causes information loss (projection, fibration).
Let $y : [0, \infty) \to Y$ be called the sensor history.
The Information State:

Let $x_t$, $u_t$, and $y_t$ denote the restrictions of $x$, $u$, and $y$, respectively, to the
domain $[0, t]$.

The information state, $\eta_t$, is given by $\eta_t = (u_t, y_t)$.
In other words, (input history, sensor history).
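A sampled approximation of these histories can be sketched numerically. Everything specific in the sketch is an assumption made for illustration: forward-Euler integration with a fixed step, a planar system with $f(x, u) = u$, a sensor that reports only the first coordinate (a non-injective $h$), and a constant input.

```python
def simulate(f, h, x0, u, t_end, dt=0.01):
    """Forward-Euler approximation of dx/dt = f(x, u(t)) with sensor y = h(x).

    Returns sampled input and sensor histories as lists of (t, value) pairs.
    """
    x, u_hist, y_hist = x0, [], []
    for k in range(round(t_end / dt) + 1):
        t = k * dt
        u_hist.append((t, u(t)))
        y_hist.append((t, h(x)))
        x = tuple(xi + dt * fi for xi, fi in zip(x, f(x, u(t))))
    return u_hist, y_hist

def restrict(history, t):
    """Restriction of a sampled history to the domain [0, t]."""
    return [(s, v) for s, v in history if s <= t + 1e-12]

# Hypothetical system: planar single integrator; the sensor drops the second coordinate.
f = lambda x, u: u
h = lambda x: x[0]
u = lambda t: (1.0, 0.5)

u_hist, y_hist = simulate(f, h, x0=(0.0, 0.0), u=u, t_end=2.0)
eta_1 = (restrict(u_hist, 1.0), restrict(y_hist, 1.0))   # information state at t = 1
print(len(eta_1[0]), eta_1[1][-1])
```

The pair `eta_1` plays the role of $\eta_t = (u_t, y_t)$ at $t = 1$. The second state coordinate never appears in it, which is the information loss caused by a non-injective sensor mapping.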

Note: Many restricted forms are possible: limited memory, sensorless, no
knowledge of inputs, etc.

The information space, $\mathcal{I}$, is the set of all possible information states.

Remember that $\mathcal{I}$ is a function space that is determined once $X$, $U$, $Y$, $f$, and
$h$ are given.

Modeling Disturbances:

Let "nature" interfere with motions and sensors.

Let $V$ and $W$ be disturbance spaces, each a subset of a Euclidean space. Let $v(t) \in V$ and
$w(t) \in W$ for all $t \in [0, \infty)$.

State transition equation:

    $\frac{dx}{dt} = f(x(t), u(t), v(t))$

Sensing model:

    $y(t) = h(x(t), w(t))$

The disturbances, $v(t)$ and $w(t)$, may be either

1. Simply unknown, or

2. Modeled as a random process.
Examples of Unknown Disturbance

Let $B$ denote a disc of unit radius, centered at the origin of $\mathbb{R}^2$.
Suppose $X \subseteq \mathbb{R}^2$, $U = B$, $Y = \mathbb{R}^2$, $f(x(t), u(t)) = u(t)$, $W = B$, and
$h(x(t), w(t)) = x(t) + w(t)$.
Suppose further that $y$ is continuous.

Consider possible $\eta_t = y_t$ (input history is assumed unknown):
Case 1: A "big" hole in $X$ (radius > 1)

Case 2: A "small" hole in $X$ (radius < 1)





11.4.3 Alternative Representations 

Nondeterministic Case: 

Given an information state $\eta_t$ and an initial set of states $X_0 \subseteq X$, derive a set
$F(\eta_t, X_0) \subseteq X$.

$F(\eta_t, X_0)$ represents a derived information state, the set of all possible $x(t) \in X$,
given $\eta_t$ and $X_0$.

The derived information space is the set of all possible F that can be derived
from $\eta_t \in \mathcal{I}$.

Probabilistic Case:

Given an information state $\eta_t$ and an initial probability measure, p(x(0)), derive a
conditional probability measure, $p(x(t) \mid \eta_t)$.

$p(x(t) \mid \eta_t)$ represents a derived information state in the probabilistic sense.

The derived information space in this case is the set of all probability measures
that can be derived from $\eta_t \in \mathcal{I}$.

11.4.4 Approximating Information States

bounding volumes for nondeterministic uncertainty 
moments for probabilistic uncertainty 

11.5 Sensors for Continuous Spaces 

Many examples can be defined, most of which are for an oriented point in a 2D
world, yielding $q = (x, y, \theta)$ and $C = \mathbb{R}^2 \times S^1$.

1. Perfect State Measurement: h(q) = q. 

2. Compass: $Y = S^1$, $h(q) = \theta$. A gyroscope is the 3D version; these work by
precession, which is the effect that keeps bicycles from falling over.

3. Positioning: Y = M 2 , h(q) = (x,y). Like GPS (but without orientation 
information). 

4. Contact: $h(q) = 1$ if $q \in \partial C_{free}$, and $h(q) = 0$ if $q \in \mbox{int}(C_{free})$.

5. Proximity: Like contact sensor, but triggers when within a specified range 
of the wall. 

6. Wheel Odometry: If accurate, it measures how far the robot has traveled. 
This is used for dead reckoning. 

7. Homing Beacon: The direction to the goal is known. $Y = S^1$, $h(q) =
\mbox{atan2}(x_g - x, y_g - y)$. This was used in bug algorithms, and also is popular
in the competitive ratio framework in algorithms. Rather than the goal,
beacons may be placed anywhere. They could be individually coded, or
confusable.

8. Geiger Counter: Gets stronger as the distance to the goal is decreased.
Also similar to the "specter detector" used in Scooby Doo to detect ghosts.
Again, there could be multiple (radioactive?) sources distributed in the
environment.

9. Speedometer: Measures the robot speed. Could also measure angular 
velocity. 

10. Clock: Measure the elapsed time from the initial state. 

11. Accelerometer: Measures only acceleration. Only relevant for problems 
that involve dynamics. 

12. Time-of-Flight: A unidirectional depth measurement. This is usually obtained
from a sonar.

13. Range Scanner: Omnidirectional depth map (or limited direction range).
Like the SICK laser. Also could characterize stereo vision.

14. Gap Sensor: Gives orientations of discontinuities around $S^1$. This is used
in [?].

15. Landmarks: It is known that the robot is within a subset of the state space. 
Many variations are possible. A whole family of sensors can be obtained 
by placing static cameras or other sensors around the environment. These 
can detect the robot and possibly give configuration information when it is 
within the sensor's active range. 

16. Pebble: Like the one used in the old maze-searching papers. These can be 
dropped to mark places where you have been before. Part of the state space 
might encode the positions of these. 

17. Joint Encoders: Measures the position of a single manipulator joint. 

18. Force Sensor: Like the contact sensor, but provides the direction and/or 
magnitude of the force. 



11.6 Examples for Continuous State Spaces 

11.6.1 Projection Sensors 

State space: $X = \{(x_1, x_2) \in \mathbb{R}^2 \mid x_2 = \sin x_1\}$
Sensor space: $Y = [-1, 1] \subset \mathbb{R}$

Control model: $f(x(t), u(t)) = u(t)$ and $U(t) = \{u(t) \in \mathbb{R}^2 \mid u \in T(x(t)) \mbox{ and } \|u\| \le 1\}$

Solutions to $\dot{x} = f(x(t), u(t))$ yield continuous state trajectories for each choice
of u.

Sensor model: $h(x(t)) = x_2$

Information state: $\eta_t = y_t$ (input history is assumed unknown)




[Figure: the sensor space Y and the state space X]

Tracking the Information State

Imagine observing some $y_t$ ...

[Figure: Y and X]

The derived information state with initial condition $X_0 = X$.

Assume $X_0 = \{x(0)\}$, some particular, given initial state.

[Figure: Y and X]

Bifurcations occur after passing through sensor readings of 1 or -1.
What is the topology of the derived information space?
Traversing a Graph 

X is a planar graph (connected, 1-dimensional CW complex) embedded in $\mathbb{R}^2$:




Consider a Simple Example

[Figure: the graph X and the sensor space Y]




How is the derived information space connected? 




A Little More General 







These models are too contrived for real robotics applications. 
However, higher-dimensional generalizations are quite relevant. 

11.6.2 Sensorless Manipulation 

$\eta_t = u_t$

A motion strategy is specified by a prescribed input history u. 
Sensorless Manipulation (Erdmann, Mason, 1986; Mason, Goldberg, 1990; 
Akella, Huang, Lynch, Mason, 1997) 

Example 11.6.1 See Figure 11.6.



Figure 11.6: Tilt the tray to roll the ball into the desired corner. 

Imagine a ball with unknown initial position rolling in a tray. 

Find a sequence of tray tilts that places the ball in a known location. 

Think about nondeterministic derived information states. 

(Example inspired by Mason, Erdmann, 1988 - polygonal part orienting) 

Example 11.6.2 (Orienting Parts) See Figure ??. 
Mechanical compliance reduces uncertainty. 
Initially, the orientation of a planar part is unknown.
Find a sequence of squeezes that enables the orientation to be known. 
(Mason and Goldberg, 1990) 

11.6.3 Environment Spaces 
11.7 State Estimation 

Need big warnings about how this is classical, but generally not needed in many
circumstances. In some sense, estimation defeats the purpose of reasoning about
information spaces.

11.7.1 Mapping Histories to States 

11.7.2 Kalman Filtering 

Linear Gaussian: information space collapses to mean and covariance.
$X = U = W = V = \mathbb{R}^n$.

Figure 11.7: A part can be oriented without sensing by performing squeezing
operations.

w and v are defined by sampling from a zero-mean Gaussian i.i.d. sequence of
random variables on W and V, respectively.

$$x(k+1) = A x(k) + B u(k) + v(k)$$

$$y(k) = C x(k) + w(k)$$

A, B, and C are n x n matrices with full rank.

If the initial probability measure over X is Gaussian, then all possible derived
information states will be Gaussian!
The continuous-time case is similar.

This means the information space can be parameterized by mean and covariance.

The information state computations are called the Kalman filter.
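
To make the mean-and-covariance parameterization concrete, here is a minimal sketch of one filter step for the scalar case n = 1. The noise variances q (for v) and r (for w) are assumptions, since the notes do not name them, and the function name `kalman_step` is hypothetical. The reading y is taken after the transition, so it plays the role of y(k+1).

```python
def kalman_step(mean, var, u, y, a, b, c, q, r):
    """One predict/update cycle for the scalar linear-Gaussian model
    x(k+1) = a x(k) + b u(k) + v(k),   y(k) = c x(k) + w(k),
    where v has variance q and w has variance r (assumed names)."""
    # Predict: push the current Gaussian through the linear dynamics.
    mean_pred = a * mean + b * u
    var_pred = a * a * var + q
    # Update: condition the predicted Gaussian on the sensor reading y.
    gain = var_pred * c / (c * c * var_pred + r)
    mean_new = mean_pred + gain * (y - c * mean_pred)
    var_new = (1.0 - gain * c) * var_pred
    return mean_new, var_new


# The derived information state is just the pair (mean, variance).
state = (0.0, 1.0)
state = kalman_step(*state, u=0.0, y=2.0, a=1.0, b=0.0, c=1.0, q=0.0, r=1.0)
```

The whole input and sensor history is summarized by two numbers per step, which is the collapse of the information space described above.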

11.8 Multiple Decision Makers 

11.8.1 Information Spaces for Everyone 

Common state space, but one information space per decision maker. 
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Strategies 

Mention for which conditions Nash equilibria exist, etc.

11.8.2 Extended Form Games 

11.8.3 Examples 

Give battleship game-like example. 

Team theory? Limited communication, but a common goal. 
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Literature 

Information spaces - where have they come from? 

• Stochastic control theory 

Due to disturbances in prediction and measurements, there is imperfect
state information.

• Differential/dynamic game theory 

Modeling unknown state information that results from the choices made by 
other players. 

• Robotics 

Uncertainty in configuration or state due to sensing limitations. 
Related robotics work: 

• Preimage Planning (Lozano-Perez, Mason, Taylor, Erdmann 1984) 

• Error Detection and Recovery (Donald, 1987) 

• Sensorless Manipulation (Erdmann, Mason, 1986; Mason, Goldberg, 1990; 
Akella, Huang, Lynch, Mason, 1997) 

• Perceptual Kinematic Maps (Herve, Cucka, Sharma, 1990) 

• Perceptual Equivalence Classes and Information Invariants (Donald, Jen- 
nings, 1991; Donald, 1995) 

• Pursuit-Evasion (Parsons, 1977; Suzuki, Yamashita, 1992; LaValle, Lin, 
Guibas, Latombe, Motwani, 1997; Simov, Slutzki, LaValle, 2000) 

• Probabilistic Robot Navigation (Simmons, Koenig, 1995) 

• Bayesian Localization (Thrun, 1998) 

Stochastic control. 
POMDPs in AI. 

Manipulation planning literature in motion planning. 

Information invariants (Donald) 

Pebbles and mazes (Blum, Kozen, related papers) 

Exercises 

1. Derive forward and backward projections for the discrete case. (This will be
several exercises, depending on the different cases; some hints will be given
too.)

2. Need a simple example that involves showing that part of the info space is
not reachable. Also, it can be unidirectional.

3. At the end of Section 11.3.2, it is mentioned that an equivalent DFA can be 
constructed from an NFA. 

(a) Give an explicit DFA that accepts the same set of strings as the NFA
in Figure 11.5.b.

(b) Express the problem of determining whether the NFA in Figure 11.5.b 
accepts any strings as a planning problem using Formulation 2.2.1. 

4. A problem that generalizes Figure 11.4 to a "plus" or "square" shape.

5. Show that the information space is not connected for Example 11.4. Give 
an example of an information state that cannot be reached from the initial 
information state. Can you characterize all of the connected components? 
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Chapter 12 



Planning in the Information 
Space 



Chapter Status

What does this mean? Check

http://msl.cs.uiuc.edu/planning/status.html

for information on the latest version.



Cover this somewhere: 

S. Blind, C. McCullough, S. Akella, and J. Ponce, "Manipulating Parts with 
an Array of Pins: A Method and a Machine," International Journal of Robotics 
Research, Vol. 20, No. 10, pp. 808-818, October 2001. 

S. Akella, W. H. Huang, K. M. Lynch, and M. T. Mason, "Parts Feeding on 
a Conveyor with a One Joint Robot," Algorithmica (Special issue on Robotics), 
Vol. 26, No. 3/4, pp. 313-344, March/April 2000. 

12.1 Information Spaces over Sets of Environments

12.1.1 Maze Searching 

Cover old Blum and Kozen-style maze searching. Really interesting stuff! 

Give some very simple mazes (e.g., 3x3), to clearly show the information 
spaces. 

Explain how building a perfect map explores the information space. 
Explain that BK are collapsing the information space. 

Give the BK exploration algorithm. Lower left corner markers, green-eyed 
automaton, etc. The automaton requires logarithmic space, which is much less 
than that required to hold a map! 
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12.1.2 Bug Algorithms 

This section addresses a motion strategy problem that deals with uncertainty in
sensing. The Bug algorithms make the following assumptions:

• The robot is a point in a 2D world.

• The obstacles are unknown and nonconvex.

• Initial and goal positions are defined.

• The robot is equipped with a short-range sensor that can detect an obstacle 
boundary from a very short distance. This allows the robot to execute a 
trajectory that follows the obstacle boundary. 

• The robot has a sensor that allows it to always know the direction and 
Euclidean distance to the goal. 

Bug 1 This robot moves in the direction of the goal until an obstacle is 
encountered. A canonical direction is followed (clockwise) until the location of 
the initial encounter is reached. The robot then follows the boundary to reach 
the point along the boundary that is closest to the goal. At this location, the 
robot moves directly toward the goal. If another obstacle is encountered, the 
same procedure is applied. 




The worst case performance, L, is

$$L \le d + \frac{3}{2} \sum_{i=1}^{N} p_i,$$

in which d is the Euclidean distance from the initial position to the goal position,
$p_i$ is the perimeter of the $i$th obstacle, and N is the number of obstacles.

Bug 2 In this algorithm, the robot always attempts to move along the line
of sight toward the goal. If an obstacle is encountered, a canonical direction is
followed until the line of sight is encountered.

The worst case performance, L, is

$$L \le d + \frac{1}{2} \sum_{i=1}^{N} n_i p_i,$$

in which $n_i$ is the number of times the $i$th obstacle crosses the line segment between
the initial position and goal position.
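
The two worst-case bounds are easy to compare numerically. The sketch below assumes the Bug 1 bound d + (3/2) Σ p_i and the Bug 2 bound d + (1/2) Σ n_i p_i, which is the form usually quoted for these algorithms; the function names and the sample obstacle data are hypothetical.

```python
def bug1_bound(d, perimeters):
    """Worst-case path length for Bug 1: d + 1.5 * (sum of perimeters)."""
    return d + 1.5 * sum(perimeters)


def bug2_bound(d, perimeters, crossings):
    """Worst-case path length for Bug 2: d + 0.5 * sum(n_i * p_i),
    where crossings[i] is the number of times obstacle i crosses the
    segment from the initial position to the goal."""
    return d + 0.5 * sum(n * p for n, p in zip(crossings, perimeters))


# Two obstacles with perimeters 4 and 6; start and goal are 10 apart.
d = 10.0
perimeters = [4.0, 6.0]
few_crossings = bug2_bound(d, perimeters, [2, 2])
many_crossings = bug2_bound(d, perimeters, [2, 4])
exhaustive = bug1_bound(d, perimeters)
```

Under these assumed bounds, Bug 2 has the smaller guarantee when each obstacle crosses the segment only a couple of times, and the larger one when the crossing counts grow.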

12.1.3 Gap Navigation Trees 

Optimal navigation can be performed with only minimal sensing information and 
no metric measurements. The representation corresponds to collapsed information 
states. 

12.2 Localization and Map Building 

This will be more estimation-oriented 

12.3 Manipulation with Minimal Information 

12.4 Visibility-Based Pursuit-Evasion 

This section addresses another motion strategy problem that deals with uncertainty
in sensing. The model is:

• The robot is a point (the "pursuer") that moves in a 2D world that is
bounded by a simple polygon (it is simply-connected). 

• The world contains a point "evader" that can move arbitrarily fast. 

• The task is to move the pursuer along a path that guarantees that the evader 
will eventually be seen using line-of-sight visibility in the 2D world. 

The problem can be formulated as a search in an information space, in which 
each information state is of the form (q, S). The information state represents the 
position of the pursuer, q, and the set, S, of places where the evader could be 
hiding. 

The key idea in developing a complete algorithm that will construct a solution 
if one exists is to partition the world into cells, such that inside of each cell there 
are no critical changes in information. 




[Figure: contaminated regions, shown for two cases: without crossing a critical boundary, and crossing a critical boundary]




A finite graph search can be performed over these cells; cells might generally
be visited multiple times. As the pursuer moves from cell to cell, the information
state is maintained by keeping binary labels on the gaps in visibility.
As the pursuer moves, gaps can generally split, merge, appear, or disappear,
but within a cell, none of these changes occur. When a transition occurs from one
cell to another, a simple transition rule specifies the new information state.

Examples 

Even though there are slight variations in the environment from example to 
example, all of these can be solved, except for the last one. 




Each example below is labeled $T_i$, in which i is the number of pursuers needed
to solve the problem.

[Figure panels: $T_1$, $T_2$, $T_3$, $T_4$]

This example requires the peak to be visited k - 1 times for k pairs of "feet".

12.5 Preimage Planning

12.6 Algorithms for Solving POMDPs

12.7 Dynamic Programming on Information Spaces

Literature 

POMDPs in AI. 

Exercises 

1. Show how different bug algorithms explore some examples.

2. Make the gap navigation tree for a given example.
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Part IV 

Planning Under Differential 
Constraints 






Chapter 13 
Differential Models 






13.1 Motivation 



In the models and methods studied so far, it has been assumed that a path can
easily be obtained between any two configurations if there are no collisions.
For example, the randomized roadmap approach assumed that two nearby configurations
could be connected by a "straight line" in the configuration space. The
constraints on the path are global in the sense that the restrictions are on the set
of allowable configurations.

For the next few chapters, local constraints will be introduced. One of the
simplest examples is a car-like robot. Imagine trying to automate the motions
of a typical automobile that has a limited steering angle. Consider the difficulty
of moving a car sideways, while the rear wheels are always pointing forward. It
would certainly make parallel parking easy if it were possible to simply turn all four
wheels toward the curb. The orientation limits of the wheels, however, prohibit
this motion. At any configuration, there are constraints on the velocity of the car.
In other words, it is permitted only to move along certain directions to ensure
that the wheels roll.

Although the motion is constrained in this way, most of us are experienced 
with making very complex driving maneuvers to parallel park a car. We would 
generally like to have algorithms that can maneuver a car-like robot and a variety 
of other nonholonomic systems while avoiding collisions. This will be the subject 
of nonholonomic planning. 
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13.2 Representing Differential Constraints 

Implicit velocity constraints 

Suppose that X represents an n-dimensional manifold that serves as the state
space. Let $x \in X$ represent a state. It will often be the case that X = C;
however, a state could include additional information. It will be assumed that
X is differentiable at every point. To enable this formally, one must generally
characterize X by using multiple coordinate systems, each of which covers
a subset of X [719]. We avoid these technicalities in the concepts that follow
because they are not critical for understanding the material.

Consider a moving point, $x \in X$. Let $\dot{x}$ denote the velocity vector,

$$\dot{x} = \left[ \frac{dx_1}{dt} \;\; \frac{dx_2}{dt} \;\; \cdots \;\; \frac{dx_n}{dt} \right].$$

Let $\dot{x}_i$ denote $dx_i/dt$. At most places in this chapter where differentiation occurs,
it can be imagined that $X = \mathbb{R}^n$. Recall that any manifold of interest can be
considered as a rectangular region in $\mathbb{R}^n$ with identification of some boundaries.
Multiple coordinate systems are generally used to ensure differentiability properties
across these identifications. Imagining that $X = \mathbb{R}^n$ will be reasonable except
at the identification points. For example, if $X = \mathbb{R}^2 \times S^1$, then special care must
be given if $\theta = 0$. Motions in the negative $\theta$ direction will actually cause $\theta$ to
increase because of the identification.

Suppose that a classical path planning problem has been defined, resulting in
$X = C$, and that a collision-free path, $\tau$, has been computed. Recall that $\tau$ was
defined as $\tau : [0, 1] \rightarrow C_{free}$. Although it did not matter before, suppose now that
[0, 1] represents an interval of time. At time t = 0 the state is $x = q_{init}$, and at
time t = 1, the state is $x = q_{goal}$. The velocity vector is $\dot{x} = d\tau/dt$.

Up to now, there have been no constraints placed on $\dot{x}$, which means that
any velocity vector is possible. Suppose that the velocity magnitude is bounded,
$\|\dot{x}\| \le 1$. Does this make the classical path planning problem more difficult?
It does not because any path $\tau : [0, 1] \rightarrow C_{free}$ can be converted into another
path, $\tau'$, which satisfies the bound by lengthening the time interval. For example,
suppose s denotes the maximum speed (velocity magnitude) along $\tau$. A new path,
$\tau' : [0, s] \rightarrow C_{free}$, can be defined by $\tau'(t) = \tau(t/s)$. For $\tau'$, the velocity will be
bounded by one for all time.
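
The time-rescaling argument can be checked on a small example. The sketch below assumes a path given as a Python function on [0, 1] and estimates its maximum speed by finite differences; the names `max_speed` and `rescale`, the sample count, and the straight-line path are all hypothetical choices for illustration.

```python
import math


def max_speed(tau, samples=1000):
    """Estimate the largest speed along tau : [0, 1] -> R^n
    using finite differences over equally spaced samples."""
    h = 1.0 / samples
    best = 0.0
    for k in range(samples):
        a, b = tau(k * h), tau((k + 1) * h)
        best = max(best, math.dist(a, b) / h)
    return best


def rescale(tau, s):
    """Return the path t -> tau(t / s), which is defined on [0, s]."""
    return lambda t: tau(t / s)


# A straight segment from (0, 0) to (3, 4), traversed at speed 5.
tau = lambda t: (3.0 * t, 4.0 * t)
s = max_speed(tau)
tau_slow = rescale(tau, s)
# Distance covered by the rescaled path during one unit of time.
unit_time_distance = math.dist(tau_slow(0.0), tau_slow(1.0))
```

The rescaled path visits the same configurations, so collision checking is unaffected; only the time parameterization changes.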

Suppose now that a constraint such as $\dot{x}_1 \le 0$ is added. This implies that
for any path, the variable $x_1$ must be monotonically nonincreasing. For example,
consider path planning for a rigid robot in the plane, yielding $X = \mathbb{R}^2 \times S^1$.
Suppose that the constraint $\dot{\theta} \le 0$ is imposed. This implies that the robot is only
capable of clockwise rotations!

In general, we allow constraints of the implicit form $h_i(x, \dot{x}) = 0$ to be imposed.
Thus, the constrained velocity can depend on the state, x. Inequality constraints
of the form $h_i(x, \dot{x}) < 0$ and $h_i(x, \dot{x}) \le 0$ are also permitted. Each constraint
restricts the set of allowable velocities at any state $x \in X$.

The state transition equation 

Although the implicit constraints are general, it is often difficult to work directly
with them. A similar difficulty exists with plotting solutions to an implicit function
of the form f(x, y) = 0, in comparison to plotting the function y = f(x). It
turns out for our problem that the implicit constraints can be converted into a
convenient form if it is possible to solve for $\dot{x}$.¹ This will yield a direct expression
for the set of allowable velocities.

For example, suppose $X = \mathbb{R}^2 \times S^1$ and let $(x, y, \theta)$ denote a state. Consider
the constraints $2\dot{x} - \dot{y} = 0$ and $\dot{\theta} - 1 \le 0$.² By simple manipulation, we can write
$\dot{x} = \frac{1}{2}\dot{y}$. What should be done with $\dot{y}$ and $\dot{\theta}$? It turns out that new variables
need to be introduced to parameterize the set of solutions. This occurs because
the set of implicit equations is generally underconstrained (i.e., there is an infinite
number of solutions). By introducing $u_1 \in \mathbb{R}$ and $u_2 \in \mathbb{R}$, we can write $\dot{y} = u_1$ and
$\dot{\theta} = u_2$ such that $u_2 \le 1$. The restriction on $u_2$ comes from the implicit equation
$\dot{\theta} - 1 \le 0$. Note that there is no restriction on $u_1$.

By solving for $\dot{x}$ and introducing extra variables, the resulting form can be considered
as a control system representation in which the extra variables represent
inputs. The input is selected by the user, and could correspond, for example, to the
steering angle of a car. Suppose f is a vector-valued function, $f : X \times U \rightarrow \mathbb{R}^n$, in
which X is an n-dimensional state space, and U is an m-dimensional input space.

The state transition equation indicates how the state will change over time,
given a current state and current input.

$$\dot{x} = f(x, u). \qquad (13.1)$$

For a given state, $x \in X$, and a given input, $u \in U$, the state transition equation
yields a velocity. Simple examples of the state transition equation will be given
in Section 13.3.
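
As a small illustration, the example with implicit constraints 2ẋ - ẏ = 0 and θ̇ - 1 ≤ 0 can be written as a state transition function and checked against the implicit form. This is only a sketch: the function names, the tolerance, and the choice to reject inputs with u2 > 1 by raising an error are assumptions made here.

```python
def f(state, u):
    """Parametric form of the example: returns (xdot, ydot, thetadot)
    for inputs u = (u1, u2) with the restriction u2 <= 1."""
    u1, u2 = u
    if u2 > 1.0:
        raise ValueError("u2 must satisfy u2 <= 1")
    return (0.5 * u1, u1, u2)


def satisfies_implicit(velocity, tol=1e-12):
    """Check the implicit constraints 2*xdot - ydot = 0 and thetadot - 1 <= 0."""
    xdot, ydot, thetadot = velocity
    return abs(2.0 * xdot - ydot) <= tol and thetadot - 1.0 <= tol


# Every admissible input produces a velocity that satisfies both constraints.
velocity = f((0.0, 0.0, 0.0), (3.0, -2.0))
ok = satisfies_implicit(velocity)
```

In this particular example the velocity does not depend on the state, so the `state` argument is unused; it is kept only to match the general signature f(x, u).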

Two different representations of differential constraints have been introduced. 
The implicit form is the most general; however, it is difficult to use in many 
cases. The state transition equation represents a parametric form that directly 
characterizes the set of allowable velocities at every point in X. The parametric 
form is also useful for numerical integration, which enables the construction of an 
incremental simulator. 



1 Jacobian-based conditions for this are given by the implicit function theorem in calculus. 
2 Be careful of notation collision. A general state vector is denoted as x; however for some 
particular instances, we also use the standard (x, y) to denote a point in the plane. 
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An Incremental Simulator 

By performing integration over time, the state transition equation can be used to
determine the state after some fixed amount of time, $\Delta t$, has passed. For example,
if we know x(t) and inputs u(t') over the interval $t' \in [t, t + \Delta t]$, then the state,
$x(t + \Delta t)$, can be determined as

$$x(t + \Delta t) = x(t) + \int_t^{t + \Delta t} f(x(t'), u(t')) \, dt'.$$

The integral above cannot be evaluated directly because x(t') appears in the integrand,
but is unknown for time t' > t.

Several numerical techniques exist for approximating the solution.
Using the fact that

$$f(x, u) = \dot{x} = \frac{dx}{dt} \approx \frac{\Delta x}{\Delta t} = \frac{x(t + \Delta t) - x(t)}{\Delta t},$$

one can solve for $x(t + \Delta t)$ to yield the classic Euler integration method,

$$x(t + \Delta t) \approx x(t) + \Delta t \, f(x(t), u(t)).$$

For many applications, too much numerical error is introduced by Euler integration.
Runge-Kutta integration provides an improvement that is based on a higher-order
Taylor series expansion of the solution. One useful form of Runge-Kutta
integration is the fourth-order approximation,

$$x(t + \Delta t) \approx x(t) + \frac{\Delta t}{6} (w_1 + 2 w_2 + 2 w_3 + w_4),$$

in which

$$w_1 = f(x(t), u(t)),$$

$$w_2 = f(x(t) + \tfrac{\Delta t}{2} w_1, u(t)),$$

$$w_3 = f(x(t) + \tfrac{\Delta t}{2} w_2, u(t)),$$

and

$$w_4 = f(x(t) + \Delta t \, w_3, u(t)).$$

For some problems, a state transition equation might not be available; however, 
it is still possible to compute any future state, given a current state and an input. 
This might occur, for example, in a complex software system that simulates the
dynamics of an automobile, or a collection of parts that bounce around on a table.
In this situation, we simply define the existence of an incremental simulator, which 
serves as a "black box" that produces a future state, given any current state and 
input. Euler and Runge-Kutta integration may be viewed as techniques that 
convert a state transition equation into an incremental simulator. 
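
The following sketch shows how Euler and fourth-order Runge-Kutta steps turn a state transition function into an incremental simulator. It assumes states are lists of floats and that the input is held constant over each step, as in the formulas above; the names `euler_step`, `rk4_step`, and `simulate`, and the test system ẋ = -x + u, are hypothetical.

```python
def euler_step(f, x, u, dt):
    """One Euler step: x + dt * f(x, u)."""
    dx = f(x, u)
    return [xi + dt * di for xi, di in zip(x, dx)]


def rk4_step(f, x, u, dt):
    """One fourth-order Runge-Kutta step with the input u held constant."""
    def shifted(w, scale):
        return [xi + scale * wi for xi, wi in zip(x, w)]

    w1 = f(x, u)
    w2 = f(shifted(w1, dt / 2.0), u)
    w3 = f(shifted(w2, dt / 2.0), u)
    w4 = f(shifted(w3, dt), u)
    return [xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, w1, w2, w3, w4)]


def simulate(step, f, x, u, dt, n):
    """Incremental simulator: apply the given step function n times."""
    for _ in range(n):
        x = step(f, x, u, dt)
    return x


# Scalar test system xdot = -x + u, whose exact solution for u = 0 is exp(-t).
decay = lambda x, u: [-x[0] + u]
x_euler = simulate(euler_step, decay, [1.0], 0.0, 0.1, 10)
x_rk4 = simulate(rk4_step, decay, [1.0], 0.0, 0.1, 10)
```

Because both step functions share the same signature, a planner can treat either one as the "black box" simulator and swap them without other changes.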
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13.3 Kinematics for Wheeled Systems 

Several interesting state transition equations can be defined to model the motions 
of objects that move by rolling wheels. For all of these examples, the state space, 
X, is equivalent to the configuration space, C. 

13.3.1 A Simple Car 

A simple example is the car-like robot. It is assumed that the car can translate
and rotate, resulting in $C = \mathbb{R}^2 \times S^1$. Assume that the state space is defined
as X = C. For convenience, let each state be denoted by $(x, y, \theta)$. Let s and
$\phi$ denote two scalar inputs, which represent the speed of the car and the steering
angle, respectively. The picture below indicates several parameters associated
with the car.




The distance between the front and rear axles is represented as L. The steering
angle is denoted by $\phi$. The configuration is given by $(x, y, \theta)$. When the steering
angle is $\phi$, the car will roll in a circular motion, in which the radius of the circle is
$\rho$. Note that $\rho$ can be determined from the intersection of the two axes as shown
(the angle between these axes is $\phi$).

The task is to represent the motion of the car as a set of equations of the form

$$\begin{array}{l} \dot{x} = f_1(x, y, \theta, s, \phi) \\ \dot{y} = f_2(x, y, \theta, s, \phi) \\ \dot{\theta} = f_3(x, y, \theta, s, \phi). \end{array}$$

In a small time interval, the car must move in the direction that the rear wheels
are pointing. This implies that $dy/dx = \tan\theta$. Since $dy/dx = \dot{y}/\dot{x}$ and
$\tan\theta = \sin\theta/\cos\theta$, this motion constraint can be written as an implicit constraint:

$$-\dot{x} \sin\theta + \dot{y} \cos\theta = 0. \qquad (13.2)$$

The equation above is satisfied if $\dot{x} = \cos\theta$ and $\dot{y} = \sin\theta$. Furthermore, any scalar
multiple of this solution is also a solution, which corresponds directly to the speed
of the car. Thus, the first two scalar components of the state transition equation
are $\dot{x} = s \cos\theta$ and $\dot{y} = s \sin\theta$.

The next task is to derive the equation for $\dot{\theta}$. Let p denote the distance
traveled by the car. Then $\dot{p} = s$, which is the speed. As shown in the figure
above, $\rho$ represents the radius of a circle that will be traversed by the center of
the rear axle, when the steering angle is fixed. Note that $dp = \rho \, d\theta$. From simple
trigonometry, $\rho = L/\tan\phi$, which implies

$$d\theta = \frac{\tan\phi}{L} \, dp.$$

Dividing by dt and using the fact that $\dot{p} = s$ yields

$$\dot{\theta} = \frac{s}{L} \tan\phi.$$

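Thus, the state transition equation for the car-like robot is

$$\left( \begin{array}{c} \dot{x} \\ \dot{y} \\ \dot{\theta} \end{array} \right) = \left( \begin{array}{c} s \cos\theta \\ s \sin\theta \\ \frac{s}{L} \tan\phi \end{array} \right).$$

Most vehicles with steering have a limited steering angle, $\phi_{max}$, such that
$0 < \phi_{max} < \pi/2$.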

The speed of the car is usually bounded. If there are only two possible speeds 
(forward or reverse), s G {—1, 1}, then the model is referred to as the Reeds-Shepp 
car [648, 730]. If the only possible speed is s = 1, then the model is referred to as 
the Dubins car [214]. 
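
A minimal sketch of the simple car as code may help when experimenting with these models. It assumes the state transition equation ẋ = s cos θ, ẏ = s sin θ, θ̇ = (s/L) tan φ, and uses plain Euler steps; the function names, the default axle distance L = 1, and the step size are hypothetical choices.

```python
import math


def simple_car(state, u, L=1.0):
    """Velocity of the simple car for state (x, y, theta) and
    input u = (s, phi): speed and steering angle."""
    x, y, theta = state
    s, phi = u
    return (s * math.cos(theta), s * math.sin(theta), s / L * math.tan(phi))


def drive(state, u, dt, steps, L=1.0):
    """Integrate the car forward with Euler steps under a constant input."""
    for _ in range(steps):
        dx = simple_car(state, u, L)
        state = tuple(q + dt * d for q, d in zip(state, dx))
    return state


# Dubins-style input: unit forward speed, wheels straight.
straight = drive((0.0, 0.0, 0.0), (1.0, 0.0), 0.01, 100)
# Reeds-Shepp-style input: the same steering with reverse speed s = -1.
reverse = drive((0.0, 0.0, 0.0), (-1.0, 0.0), 0.01, 100)
```

Restricting s to {1} or {-1, 1} in the caller gives the Dubins and Reeds-Shepp variants without changing `simple_car` itself.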



13.3.2 A Continuous-Steering Car 



In the previous model, the steering angle, $\phi$, was an input, which implies that one
can instantaneously move the front wheels. In many applications, this assumption
is unrealistic. In the path traced out in the plane by the center of the rear axle
of the car, a curvature discontinuity will occur when the steering angle is
changed discontinuously. To make a car model that only generates smooth paths,
the steering angle can be added as a state variable. The input is the angular
velocity, $\omega$, of the steering angle.

The result is a four-dimensional state space, in which each state is represented
as $(x, y, \phi, \theta)$. This yields the following state transition equation:

$$\left( \begin{array}{c} \dot{x} \\ \dot{y} \\ \dot{\phi} \\ \dot{\theta} \end{array} \right) = \left( \begin{array}{c} s \cos\theta \\ s \sin\theta \\ \omega \\ \frac{s}{L} \tan\phi \end{array} \right),$$

in which there are two inputs, s and $\omega$. This model was considered in [673].
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13.3.3 A Car Pulling Trailers 

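The continuous-steering car can be extended to allow one or more single-axle trailers
to be pulled. For k trailers, the state is represented as $(x, y, \phi, \theta_0, \theta_1, \ldots, \theta_k)$.

The state transition equation is

$$\left( \begin{array}{c} \dot{x} \\ \dot{y} \\ \dot{\phi} \\ \dot{\theta}_0 \\ \vdots \\ \dot{\theta}_i \\ \vdots \end{array} \right) = \left( \begin{array}{c} s \cos\theta_0 \\ s \sin\theta_0 \\ \omega \\ \frac{s}{L} \tan\phi \\ \vdots \\ \frac{s}{d_i} \left( \prod_{j=1}^{i-1} \cos(\theta_{j-1} - \theta_j) \right) \sin(\theta_{i-1} - \theta_i) \\ \vdots \end{array} \right),$$

in which $\theta_0$ is the orientation of the car, $\theta_i$ is the orientation of the $i$th trailer, and
$d_i$ is the distance from the $i$th trailer wheel axle to the hitch point. This model
was considered in [573].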



13.3.4 A Differential Drive 

The differential drive model is very common in mobile robotics. It consists of a
single axle, which connects two independently-controlled wheels. Each wheel is
driven by its own motor, and it is free to rotate without affecting the other wheel.
Each state is represented as $(x, y, \theta)$. The state transition equation is

$$\left( \begin{array}{c} \dot{x} \\ \dot{y} \\ \dot{\theta} \end{array} \right) = \left( \begin{array}{c} \frac{r}{2} (u_l + u_r) \cos\theta \\ \frac{r}{2} (u_l + u_r) \sin\theta \\ \frac{r}{\ell} (u_r - u_l) \end{array} \right), \qquad (13.3)$$

in which r is the wheel radius, $\ell$ is the axle length, $u_r$ is the angular velocity of
the right wheel, and $u_l$ is the angular velocity of the left wheel.








If $u_l = u_r = 1$, the differential drive rolls forward. If $u_l = u_r = -1$, the
differential drive rolls in the opposite direction. If $u_l = -u_r$, the differential drive
performs a rotation.
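
The three cases above can be verified directly from the state transition equation. The sketch below assumes the form ẋ = (r/2)(u_l + u_r) cos θ, ẏ = (r/2)(u_l + u_r) sin θ, θ̇ = (r/ℓ)(u_r - u_l); the function name and the default wheel radius and axle length are hypothetical.

```python
import math


def differential_drive(state, u, r=1.0, axle=1.0):
    """Velocity of a differential drive for state (x, y, theta) and
    input u = (u_l, u_r): left and right wheel angular velocities."""
    x, y, theta = state
    u_l, u_r = u
    v = 0.5 * r * (u_l + u_r)            # forward speed of the axle center
    return (v * math.cos(theta), v * math.sin(theta), r / axle * (u_r - u_l))


forward = differential_drive((0.0, 0.0, 0.0), (1.0, 1.0))     # rolls forward
backward = differential_drive((0.0, 0.0, 0.0), (-1.0, -1.0))  # rolls backward
spin = differential_drive((0.0, 0.0, 0.0), (-1.0, 1.0))       # rotates in place
```

Equal wheel speeds give pure translation along the heading, and opposite wheel speeds give pure rotation about the axle center.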

13.4 Rigid-Body Dynamics 

so far, this is only a point mass... 

For problems that involve dynamics, constraints will exist on accelerations, in
addition to velocities and configurations. Accelerations may appear problematic
because they represent second-order derivatives, which cannot appear in the state
transition equation (13.1). To overcome this problem, a state space will be defined
that allows the equations of motion to be converted into the form $\dot{x} = f(x, u)$.
Usually, the dimension of this state space is twice the dimension of the configuration
space.

The state space For a broad class of problems, equations of motion that involve
dynamics can be expressed as $\ddot{q} = g(q, \dot{q})$, for some measurable function g. Suppose
a problem is defined on an n-dimensional configuration space, C. Define a 2n-dimensional
state vector $x = [q \;\; \dot{q}]$. In other words, x represents both configuration
and velocity,

$$x = [q_1 \;\; q_2 \;\; \cdots \;\; q_n \;\; \dot{q}_1 \;\; \dot{q}_2 \;\; \cdots \;\; \dot{q}_n].$$

Let X denote the 2n-dimensional state space, which is the set of all state vectors.

The goal is to construct a state transition equation of the form $\dot{x} = f(x, u)$.
Given the definition of the state vector, note that $\dot{x}_i = x_{n+i}$ if $i \le n$. This
immediately defines half of the components of the state transition equation. The
other half is defined using $\ddot{q} = g(q, \dot{q})$. This is obtained by simply substituting
each of the $q$, $\dot{q}$, and $\ddot{q}$ variables by their state space equivalents.

Example: Lunar lander A simple example that illustrates the concepts is
given. The same principles can be applied to obtain equations of motion of the
form $\dot{x} = f(x, u)$ for state spaces that represent the configuration and velocity of
rigid and articulated bodies.

The lander is modeled as a point with mass, m, in a 2D world. It is not
allowed to rotate, implying that $C = \mathbb{R}^2$. There are three thrusters on the lander:
Thruster One (right side), Thruster Two (bottom), and Thruster Three (left side).
The activation of each thruster is considered as a binary switch. Let $u_i$ denote
a binary-valued action that can activate the $i$th thruster. If $u_i = 1$, the thruster
fires; if $u_i = 0$, then the thruster is dormant. Each of the two lateral thrusters
provides a force $F_s$ when activated. The upward thruster, mounted to the bottom
of the lander, provides a force $F_u$ when activated. Let g denote the acceleration
of gravity.

From simple Newtonian mechanics, $\sum F = ma$, in which $\sum F$ denotes the 
vector sum of the forces, $m$ denotes the mass of the lander, and $a$ denotes the 
acceleration, $\ddot{q}$. The $q_1$-component ($x$-direction) yields 

\[ m\ddot{q}_1 = u_1 F_s - u_3 F_s \]

and the $q_2$-component ($y$-direction) yields 

\[ m\ddot{q}_2 = u_2 F_u - mg. \]

The constraints above can be written in the form $f(\ddot{q},\dot{q},q) = 0$ (actually, the 
equations are simple enough to obtain $f(\ddot{q}) = 0$). 

The lunar lander model can be transformed into a four-dimensional state space 
in which $x = [q_1 \; q_2 \; \dot{q}_1 \; \dot{q}_2]$. By replacing $\dot{q}_1$ and $\dot{q}_2$ with $x_3$ and $x_4$, respectively, 
the Newtonian equations of motion can be written as 

\[ \dot{x}_3 = \frac{F_s}{m}(u_1 - u_3) \]

\[ \dot{x}_4 = \frac{u_2 F_u}{m} - g. \]

Since $\dot{x}_1 = x_3$ and $\dot{x}_2 = x_4$, the state transition equation becomes 

\[ \begin{pmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \\ \dot{x}_4 \end{pmatrix} = \begin{pmatrix} x_3 \\ x_4 \\ \frac{F_s}{m}(u_1 - u_3) \\ \frac{u_2 F_u}{m} - g \end{pmatrix}, \]

which is in the desired form $\dot{x} = f(x,u)$. 
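The lunar lander state transition equation can be sketched in code as follows. This is only an illustration: the function names, the default parameter values, and the forward-Euler step are assumptions made here and do not come from the text.

```python
def lander_f(x, u, m=1.0, F_s=1.0, F_u=2.0, g=1.625):
    """Return x_dot = f(x, u) for the lunar lander.

    x = (q1, q2, q1_dot, q2_dot) and u = (u1, u2, u3) with each u_i in {0, 1}.
    The default parameter values are illustrative assumptions.
    """
    _x1, _x2, x3, x4 = x
    u1, u2, u3 = u
    return (
        x3,                    # x1_dot = x3
        x4,                    # x2_dot = x4
        F_s * (u1 - u3) / m,   # from m * q1_ddot = u1*F_s - u3*F_s
        u2 * F_u / m - g,      # from m * q2_ddot = u2*F_u - m*g
    )


def euler_step(x, u, dt, **params):
    """Advance the state by one forward-Euler step (a crude integrator)."""
    dx = lander_f(x, u, **params)
    return tuple(xi + dt * dxi for xi, dxi in zip(x, dx))
```

With the bottom thruster on and $F_u = mg$, the last component of `lander_f` is zero, so the lander neither gains nor loses vertical velocity.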



13.5 Multiple-Body Dynamics 

13.6 More Examples 

This section includes other examples of state transition equations. 

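The nonholonomic integrator Here is a simple nonholonomic system that 
might be useful for experimentation. Let $X = \mathbb{R}^3$, and let the set of inputs be 
$U = \mathbb{R}^2$. The state transition equation for the nonholonomic integrator is 

\[ \begin{pmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \end{pmatrix} = \begin{pmatrix} u_1 \\ u_2 \\ x_1 u_2 - x_2 u_1 \end{pmatrix}. \]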
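For experimentation, the nonholonomic integrator can be simulated with a few lines of code. The sketch below assumes the form $\dot{x}_1 = u_1$, $\dot{x}_2 = u_2$, $\dot{x}_3 = x_1 u_2 - x_2 u_1$; the function names, the forward-Euler scheme, and the square-loop input are illustrative choices rather than anything prescribed by the text.

```python
def integrator_f(x, u):
    """State transition equation of the nonholonomic integrator."""
    x1, x2, _x3 = x
    u1, u2 = u
    return (u1, u2, x1 * u2 - x2 * u1)


def simulate(x, segments, dt=1e-3):
    """Forward-Euler simulation; segments is a list of (u, duration) pairs."""
    for u, duration in segments:
        for _ in range(round(duration / dt)):
            dx = integrator_f(x, u)
            x = tuple(xi + dt * dxi for xi, dxi in zip(x, dx))
    return x


# Drive x1 and x2 around a unit square, one side per second.
square_loop = [((1, 0), 1.0), ((0, 1), 1.0), ((-1, 0), 1.0), ((0, -1), 1.0)]
final_state = simulate((0.0, 0.0, 0.0), square_loop)
```

In this experiment $x_1$ and $x_2$ return to their starting values while $x_3$ does not: it changes by twice the area enclosed by the loop. This is the kind of behavior that makes the system interesting despite its simple equations.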



Chapter 14 

Nonholonomic System Theory 



Chapter Status 

What does this mean? Check 
http://msl.cs.uiuc.edu/planning/status.html 
for information on the latest version. 



This chapter deals with the analysis of problems that involve differential constraints. 
One fundamental result is the Frobenius theorem, which allows one to 
determine whether the system represented by the state transition equation is actually 
nonholonomic. In some cases, it may be possible to integrate the state transition 
equation, resulting in a problem that can be described without differential models. 
Another result is Chow's theorem, which indicates whether a system is controllable. 
Intuitively, this means that the differential constraints can be completely 
overcome by generating arbitrarily short maneuvers. The car-like robot enjoys the 
controllability property, which enables it to move itself sideways by performing 
parallel parking maneuvers. 

14.1 Vector Fields and Distributions 

A special form of the state transition equation Most of the concepts in this 
chapter are developed under the assumption that the state transition equation, 
$\dot{x} = f(x,u)$, has the following form: 

\[ \dot{x} = a^1(x)u_1 + a^2(x)u_2 + \cdots + a^m(x)u_m, \tag{14.1} \]

in which each $a^i(x)$ is a vector-valued function of $x$, and $m$ is the dimension of 
$U$ (or the number of inputs). The $a^i$ functions can also be arranged in an $n \times m$ 
matrix, 

\[ A(x) = [a^1(x) \;\; a^2(x) \;\; \cdots \;\; a^m(x)]. \tag{14.2} \]

It will usually be assumed that $m < n$. 
In this case, the state transition equation can be expressed as 

\[ \dot{x} = A(x)u. \tag{14.3} \]

For the rest of the chapter, it will be assumed that the matrix $A(x)$ is nonsingular, 
meaning that it has full rank $m$. In other words, the columns of $A(x)$ are linearly 
independent for all $x$. To determine whether $A(x)$ is nonsingular at a given $x$, one must 
find at least one $m \times m$ submatrix of $A(x)$ that has a nonzero determinant. 

Vector fields A vector field, $V$, on a manifold $X$, is a function that associates 
with each $x \in X$, a vector, $V(x)$. The velocity field is a special vector field 
that will be used extensively. Each vector $V(x)$ in a velocity field represents the 
infinitesimal change in state with respect to time, 

\[ \left[ \frac{dx_1}{dt} \;\; \frac{dx_2}{dt} \;\; \cdots \;\; \frac{dx_n}{dt} \right], \]

evaluated at the point $x$. 

Note that for a fixed $u$, any state transition equation, $\dot{x} = f(x,u)$, defines a 
vector field because $\dot{x}$ is expressed as a function of $x$. 

Distributions Each input $u \in U$ can be used to define a vector field. It will be 
convenient to define the set of all vector fields that can be generated using inputs. 
Assume that a state transition equation of the form in (14.1) is given for a state 
space $X$, and an input space $U = \mathbb{R}^m$. The set of all vector fields that can be 
generated using inputs $u \in U$ is called the distribution, and is denoted by $\Delta(X)$ 
or $\Delta$. 

The distribution can be considered as a vector space. Note that each $a^i$ can 
be interpreted as a vector field. Any vector field in $\Delta$ can be expressed as a 
linear combination of the $a^i$ functions, which serve as a basis of the vector space. 
Consider the effect of inputs of the form 

\[ \begin{array}{l} [1 \; 0 \; 0 \; \cdots \; 0] \\ [0 \; 1 \; 0 \; \cdots \; 0] \\ \qquad \vdots \\ [0 \; \cdots \; 0 \; 1 \; 0] \\ [0 \; 0 \; \cdots \; 0 \; 1]. \end{array} \tag{14.4} \]

If $u_i = 1$, and $u_j = 0$ for $j \neq i$, then the state transition equation yields $\dot{x} = a^i(x)$. 
Thus, each input in this form can be used to generate a basis vector field. The 
dimension of the distribution is the number of vector fields in its basis (in other 
words, the maximum number of linearly independent vector fields that can be 
generated). 

In terms of basis vector fields, a distribution is expressed as 

\[ \Delta = \mathrm{span}\{a^1(x), a^2(x), \ldots, a^m(x)\}. \tag{14.5} \]



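Example: Differential Drive The state transition equation (13.3) for the 
differential drive can be expressed in the form of (14.1) as follows: 

\[ \begin{pmatrix} \dot{x} \\ \dot{y} \\ \dot{\theta} \end{pmatrix} = \begin{pmatrix} (\frac{r}{2}\cos\theta)u_l + (\frac{r}{2}\cos\theta)u_r \\ (\frac{r}{2}\sin\theta)u_l + (\frac{r}{2}\sin\theta)u_r \\ -\frac{r}{\ell}u_l + \frac{r}{\ell}u_r \end{pmatrix} = \begin{pmatrix} \frac{r}{2}\cos\theta & \frac{r}{2}\cos\theta \\ \frac{r}{2}\sin\theta & \frac{r}{2}\sin\theta \\ -\frac{r}{\ell} & \frac{r}{\ell} \end{pmatrix} \begin{pmatrix} u_l \\ u_r \end{pmatrix} = A(x,y,\theta) \begin{pmatrix} u_l \\ u_r \end{pmatrix}. \tag{14.6} \]

The matrix $A(x,y,\theta)$ is nonsingular because, at every state, at least one of its 
three $2 \times 2$ submatrices has a nonzero determinant: the two submatrices that include 
the third row have determinants $\frac{r^2}{\ell}\cos\theta$ and $\frac{r^2}{\ell}\sin\theta$, which never vanish simultaneously. 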

To simplify the characterization of the distribution, a linear transformation 
will be performed on the inputs. Let $u_1 = u_l + u_r$ and $u_2 = u_r - u_l$. Intuitively, $u_1$ 
means "go straight" and $u_2$ means "rotate". Note that the original $u_l$ and $u_r$ can 
be easily recovered from $u_1$ and $u_2$. For additional simplicity, assume that $\ell = 2$ 
and $r = 2$. The state transition equation becomes 

\[ \begin{pmatrix} \dot{x} \\ \dot{y} \\ \dot{\theta} \end{pmatrix} = \begin{pmatrix} \cos\theta & 0 \\ \sin\theta & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}. \tag{14.7} \]

Using input $u = [1 \; 0]$, the vector field $V = [\cos\theta \;\; \sin\theta \;\; 0]$ is obtained. Using 
$u = [0 \; 1]$, the vector field $W = [0 \; 0 \; 1]$ is obtained. Any other vector field that 
can be generated using inputs can be constructed as a linear combination of $V$ 
and $W$. The distribution $\Delta$ has dimension two, and is expressed as $\mathrm{span}\{V, W\}$. 
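The rank condition on $A(x,y,\theta)$ for the differential drive can be checked numerically. The sketch below assumes the differential drive matrix has rows $[\frac{r}{2}\cos\theta \;\; \frac{r}{2}\cos\theta]$, $[\frac{r}{2}\sin\theta \;\; \frac{r}{2}\sin\theta]$, and $[-\frac{r}{\ell} \;\; \frac{r}{\ell}]$; the function names and the tolerance are hypothetical.

```python
import math


def drive_matrix(theta, r=2.0, ell=2.0):
    """Assumed 3x2 matrix A(x, y, theta); columns multiply u_l and u_r."""
    c, s = math.cos(theta), math.sin(theta)
    return [
        [0.5 * r * c, 0.5 * r * c],
        [0.5 * r * s, 0.5 * r * s],
        [-r / ell, r / ell],
    ]


def minors_2x2(A):
    """Determinants of the three 2x2 submatrices formed from pairs of rows."""
    pairs = [(0, 1), (0, 2), (1, 2)]
    return {(i, j): A[i][0] * A[j][1] - A[i][1] * A[j][0] for i, j in pairs}


def has_full_column_rank(A, tol=1e-12):
    """True if at least one 2x2 minor is nonzero, so the columns are independent."""
    return any(abs(d) > tol for d in minors_2x2(A).values())
```

The minor built from the first two rows is always zero, so the test relies on the two minors that involve the third row.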




14.2 The Lie Bracket 

The Lie bracket attempts to generate velocities that are not directly permitted 
by the state transition equation. For the car-like robot, it will produce a vector 
field that can move the car sideways (it is achieved through combinations of vector 
fields, and therefore does not violate the nonholonomic constraint). This operation 
is called the Lie bracket (pronounced as "Lee"), and for given vector fields $V$ and 
$W$, it is denoted by $[V, W]$. The Lie bracket is computed by 



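\[ [V, W] = DW \cdot V - DV \cdot W, \tag{14.8} \]

in which $\cdot$ denotes a matrix-vector multiplication, 

\[ DV = \begin{pmatrix} \frac{\partial V_1}{\partial x_1} & \frac{\partial V_1}{\partial x_2} & \cdots & \frac{\partial V_1}{\partial x_n} \\ \frac{\partial V_2}{\partial x_1} & \frac{\partial V_2}{\partial x_2} & \cdots & \frac{\partial V_2}{\partial x_n} \\ \vdots & \vdots & & \vdots \\ \frac{\partial V_n}{\partial x_1} & \frac{\partial V_n}{\partial x_2} & \cdots & \frac{\partial V_n}{\partial x_n} \end{pmatrix}, \tag{14.9} \]

and 

\[ DW = \begin{pmatrix} \frac{\partial W_1}{\partial x_1} & \frac{\partial W_1}{\partial x_2} & \cdots & \frac{\partial W_1}{\partial x_n} \\ \frac{\partial W_2}{\partial x_1} & \frac{\partial W_2}{\partial x_2} & \cdots & \frac{\partial W_2}{\partial x_n} \\ \vdots & \vdots & & \vdots \\ \frac{\partial W_n}{\partial x_1} & \frac{\partial W_n}{\partial x_2} & \cdots & \frac{\partial W_n}{\partial x_n} \end{pmatrix}. \tag{14.10} \]

In the expressions above, $V_i$ and $W_i$ denote the $i$th components of $V$ and $W$, 
respectively. 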

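To compute the Lie bracket it is often convenient to directly use the expression 
for each component of the new vector field. This is obtained by performing the 
multiplication indicated above. The $i$th component of the Lie bracket is given by 

\[ [V, W]_i = \sum_{j=1}^{n} \left( V_j \frac{\partial W_i}{\partial x_j} - W_j \frac{\partial V_i}{\partial x_j} \right). \tag{14.11} \]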
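When symbolic differentiation is inconvenient, the componentwise Lie bracket formula can be approximated numerically. The following is a minimal sketch, assuming central finite differences for the Jacobians; the function names and the step size `h` are hypothetical. The fields `V` and `W` are the ones obtained for the differential drive in Section 14.1.

```python
import math


def jacobian(field, x, h=1e-6):
    """Central-difference Jacobian: J[i][j] approximates d field_i / d x_j at x."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = field(xp), field(xm)
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return J


def lie_bracket(V, W, x, h=1e-6):
    """Approximate [V, W](x) = DW(x) V(x) - DV(x) W(x)."""
    n = len(x)
    DV, DW = jacobian(V, x, h), jacobian(W, x, h)
    v, w = V(x), W(x)
    return [
        sum(v[j] * DW[i][j] - w[j] * DV[i][j] for j in range(n))
        for i in range(n)
    ]


def V(x):
    """'Go straight' field of the differential drive; x = (x, y, theta)."""
    return [math.cos(x[2]), math.sin(x[2]), 0.0]


def W(x):
    """'Rotate' field of the differential drive."""
    return [0.0, 0.0, 1.0]
```

Calling `lie_bracket(V, W, [1.0, -2.0, 0.7])` returns a vector close to $[\sin 0.7 \;\; -\cos 0.7 \;\; 0]$, and swapping the arguments flips the sign, as skew-symmetry requires.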



Two well-known properties of the Lie bracket are: 

1. (skew-symmetry) $[V, W] = -[W, V]$ for any two vector fields, $V$ and $W$ 

2. (Jacobi identity) $[[V, W], U] + [[W, U], V] + [[U, V], W] = 0$ 

It can be shown using Taylor series expansions that the Lie bracket $[V, W]$ can 
be approximated by performing a sequence of four integrations. From a point, 
$x \in X$, the Lie bracket yields a motion in the direction obtained after performing 

1. Motion along $V$ for time $\Delta t/4$ 

2. Motion along $W$ for time $\Delta t/4$ 

3. Motion along $-V$ for time $\Delta t/4$ 

4. Motion along $-W$ for time $\Delta t/4$ 

The direction from $x$ to the resulting state after performing the four motions 
represents the direction given by the Lie bracket, as shown in Figure 14.1 by the 
dashed arrow. 

Figure 14.1: The velocity obtained by the Lie bracket can be approximated as a 
sequence of four motions. 
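The four-motion construction can be tried numerically. The sketch below uses the differential drive fields from Section 14.1 and a fourth-order Runge-Kutta integrator; the integrator, the step count, and all function names are assumptions made for illustration.

```python
import math


def V(x):
    """'Go straight' field of the differential drive; x = (x, y, theta)."""
    return [math.cos(x[2]), math.sin(x[2]), 0.0]


def W(x):
    """'Rotate' field of the differential drive."""
    return [0.0, 0.0, 1.0]


def flow(field, x, duration, sign=1.0, steps=200):
    """Integrate x_dot = sign * field(x) for the given duration with RK4."""
    x = list(x)
    h = duration / steps

    def f(y):
        return [sign * c for c in field(y)]

    for _ in range(steps):
        k1 = f(x)
        k2 = f([a + 0.5 * h * b for a, b in zip(x, k1)])
        k3 = f([a + 0.5 * h * b for a, b in zip(x, k2)])
        k4 = f([a + h * b for a, b in zip(x, k3)])
        x = [a + h * (p + 2 * q + 2 * s + t) / 6.0
             for a, p, q, s, t in zip(x, k1, k2, k3, k4)]
    return x


def four_motions(V, W, x, dt):
    """Apply V, W, -V, -W in sequence, each for time dt / 4."""
    quarter = dt / 4.0
    y = flow(V, x, quarter)
    y = flow(W, y, quarter)
    y = flow(V, y, quarter, sign=-1.0)
    return flow(W, y, quarter, sign=-1.0)
```

Starting from `[0.0, 0.0, 0.0]` with `dt = 0.4`, the orientation returns to zero and the position ends slightly below the start. Nearly all of the displacement lies along the negative $y$ direction, and the small residual in $x$ shrinks faster than the $y$ displacement as `dt` is reduced.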

14.3 Integrability and Controllability 

The Lie bracket can be used to generate vector fields that potentially lie outside 
of $\Delta$. There are two theorems that express useful system properties that can be 
inferred using the vector fields generated by Lie brackets. 

The Control Lie Algebra (CLA) For a given state transition equation of the 
form (14.1), consider the set of all vector fields that can be generated by taking Lie 
brackets, $[a^i(x), a^j(x)]$, of vector fields $a^i(x)$ and $a^j(x)$ for $i \neq j$. Next, consider 
taking Lie brackets of the new vector fields with each other, and with the original 
vector fields. This process can be repeated indefinitely by iteratively applying 
the Lie bracket operations to new vector fields. The resulting set of vector fields 
can be considered as a kind of algebraic closure with respect to the Lie bracket 
operation. Let the control Lie algebra, $\mathrm{CLA}(\Delta)$, denote the set of all vector fields 
that are obtained by this process. 

In general, $\mathrm{CLA}(\Delta)$ can be considered as a vector space, in which the basis 
elements are the vector fields $a^1(x), \ldots, a^m(x)$, and all new, linearly independent 
vector fields that were generated from the Lie bracket operations. 
Finding the basis of $\mathrm{CLA}(\Delta)$ is generally a tedious process. There are several 
systematic approaches for generating the basis, of which one of the most common 
is called the Philip Hall basis. This basis automatically eliminates any vector 
fields from the Lie bracket calculations that could be obtained by skew-symmetry 
or the Jacobi identity. 

Each Lie bracket has the opportunity to generate a vector field that is linearly 
independent; however, it is not guaranteed to generate one. In fact, all Lie bracket 
operations may fail to generate a vector field that is independent of the original 
vector fields. Consider, for example, the case in which the original vector fields, 
$a^i$, are all constant. All Lie brackets will be zero. 

Integrability In some cases, it is possible that the differential constraints are 
integrable. This implies that they can be expressed purely as a function of $x$ and $u$, 
and not of $\dot{x}$. In the case of an integrable state transition equation, the motion is 
actually restricted to a lower-dimensional subset of $X$, which is a global constraint 
as opposed to a local constraint. 

As a simple example, suppose that $X = \mathbb{R}^2$, and a state transition equation is: 

\[ \begin{pmatrix} \dot{x} \\ \dot{y} \end{pmatrix} = \begin{pmatrix} -y \\ x \end{pmatrix} u_1. \tag{14.12} \]

Suppose that an initial state $(a, 0)$ is given for some $a \in (0, \infty)$. By selecting 
an input $u_1 \in (0, \infty)$, integration of the state transition equation over time will 
yield a counterclockwise path along a circle of radius $a$, centered at the origin. If 
$u_1 < 0$, then a clockwise motion along the circle is generated. Note that starting 
from any initial state, there is no way for the state, $(x(t), y(t))$, to leave a circle 
centered at the origin. Thus, the state transition equation simply represents a 
global constraint that the set of states is constrained to a circle. 

In general, it is very difficult to determine whether a given state transition 
equation can be integrated to remove all differential constraints. The Frobenius theorem 
gives an interesting condition that may be applied to determine whether the state 
transition equation is integrable. 

Theorem 1 (Frobenius) The state transition equation is integrable if and only if 
all vector fields that can be obtained by Lie bracket operations are contained in $\Delta$. 

Intuitively, if the Lie bracket operation is unable to produce any new (linearly 
independent) vector fields that lie outside of $\Delta$, then the state transition equation 
can be integrated. Thus, the equation is not needed, and the problem can be 
reformulated without using $\dot{x}$. This is, however, a theoretical result; it may be a 
difficult or impossible task in general to integrate the state transition equation in 
practice. 

The Frobenius theorem can also be expressed in terms of dimensions. If 
$\dim(\mathrm{CLA}(\Delta)) = \dim(\Delta)$, then the state transition equation is integrable. Note 
that the dimension of $\mathrm{CLA}(\Delta)$ can never be greater than $n$. 

If the state transition equation is not integrable, then it is called nonholonomic. 
These equations are of greatest interest. 



Controllability In addition to integrability, another important property of a 
state transition equation is controllability. Intuitively, controllability implies that 
the robot is able to overcome its differential constraints by using Lie brackets 
to compose new motions. The controllability concepts assume that there are no 
obstacles. 

Two kinds of controllability will be considered. A point, $x'$, is reachable from 
$x$, if there exists an input that can be applied to bring the state from $x$ to $x'$. Let 
$R(x)$ denote the set of all points reachable from $x$. A system is locally controllable 
if for all $x \in X$, $R(x)$ contains an open set that contains $x$. This implies that any 
state can be reached from any other state. 

Let $R(x, \Delta t)$ denote the set of all points reachable in time $\Delta t$. A system is 
small-time controllable if for all $x \in X$ and any $\Delta t > 0$, $R(x, \Delta t)$ contains an 
open set that contains $x$. 

The Dubins car is an example of a system that is locally controllable, but not 
small-time controllable. If there are no obstacles, it is possible to bring the car to 
any desired configuration from any initial configuration. This implies that the car 
is locally controllable. Suppose one would like to move the car to a position that 
would be obtained by the Reeds-Shepp car by moving a small amount in reverse. 
Because the Dubins car must drive forward to reach this configuration, it could 
require time larger than some small $\Delta t$. Hence, the Dubins car is not small-time 
controllable. 

However, a substantial amount of time might be required to drive the car. 

Chow's theorem is used to determine small-time controllability. 

Theorem 2 (Chow) A system is small-time controllable if and only if the dimension 
of $\mathrm{CLA}(\Delta)$ is $n$, the dimension of $X$. 



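Example of integrability and controllability As an example of controllability 
and integrability, recall the differential drive model. From the differential 
drive example in Section 14.1, the original vector fields are $a^1(x) = [\cos\theta \;\; \sin\theta \;\; 0]$ 
and $a^2(x) = [0 \; 0 \; 1]$. 

Let $V$ denote $a^1(x)$, and let $W$ denote $a^2(x)$. To determine integrability and 
controllability, the first step is to compute the Lie bracket, $Z = [V, W]$. The 
components are 

\[ Z_1 = V_1 \frac{\partial W_1}{\partial x} - W_1 \frac{\partial V_1}{\partial x} + V_2 \frac{\partial W_1}{\partial y} - W_2 \frac{\partial V_1}{\partial y} + V_3 \frac{\partial W_1}{\partial \theta} - W_3 \frac{\partial V_1}{\partial \theta} = \sin\theta, \tag{14.13} \]

\[ Z_2 = V_1 \frac{\partial W_2}{\partial x} - W_1 \frac{\partial V_2}{\partial x} + V_2 \frac{\partial W_2}{\partial y} - W_2 \frac{\partial V_2}{\partial y} + V_3 \frac{\partial W_2}{\partial \theta} - W_3 \frac{\partial V_2}{\partial \theta} = -\cos\theta, \tag{14.14} \]

and 

\[ Z_3 = V_1 \frac{\partial W_3}{\partial x} - W_1 \frac{\partial V_3}{\partial x} + V_2 \frac{\partial W_3}{\partial y} - W_2 \frac{\partial V_3}{\partial y} + V_3 \frac{\partial W_3}{\partial \theta} - W_3 \frac{\partial V_3}{\partial \theta} = 0. \tag{14.15} \]

The resulting vector field is $Z = [\sin\theta \;\; -\cos\theta \;\; 0]$. 

We immediately observe that $Z$ is linearly independent from $V$ and $W$. This 
can be seen by noting that the determinant of the matrix 

\[ \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ 0 & 0 & 1 \\ \sin\theta & -\cos\theta & 0 \end{pmatrix} \tag{14.16} \]

is nonzero for all $(x, y, \theta)$. This implies that the dimension of $\mathrm{CLA}(\Delta)$ is 3. Using 
the Frobenius theorem, it can be inferred that the state transition equation is not 
integrable, and the system is nonholonomic. From Chow's theorem, it is known 
that the system is small-time controllable. 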
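The determinant claim can be verified directly: expanding along the first row gives $\cos^2\theta + \sin^2\theta = 1$ for every $\theta$. A small numerical check is sketched below; the function names are hypothetical.

```python
import math


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def bracket_matrix(theta):
    """Rows are V, W, and Z = [V, W] for the differential drive."""
    return [
        [math.cos(theta), math.sin(theta), 0.0],
        [0.0, 0.0, 1.0],
        [math.sin(theta), -math.cos(theta), 0.0],
    ]
```

Since the determinant equals 1 at every orientation, the three vector fields span a three-dimensional space at every state, which is the condition used in both theorems.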

A nice interpretation of the result can be constructed by using the motions 
depicted in Figure 14.1. Suppose the initial state is $(0, 0, 0)$. The Lie bracket at 
this state is $[0 \;\; -1 \;\; 0]$, which can be constructed by four motions: 1) apply 
$u_1$, which translates the drive along the X axis; 2) apply $u_2$, which rotates the 
drive counterclockwise; 3) apply $-u_1$, which translates the drive back towards 
the Y axis, but the motion is at a downward angle due to the rotation; 4) apply 
$-u_2$, which rotates the drive back into its original orientation. The net effect of 
these four motions moves the differential drive downward along the Y axis, which 
is precisely the direction $[0 \;\; -1 \;\; 0]$ given by the Lie bracket! 



Chapter 15 



Planning Under Differential Constraints 



Chapter Status 

What does this mean? Check 
http://msl.cs.uiuc.edu/planning/status.html 
for information on the latest version. 



This chapter presents several alternative planning methods. For each method, 
it is assumed that a state transition equation or incremental simulator has been 
defined over a state space. The state could represent configuration or both 
configuration and velocity. 

15.1 Problem formulations 

Nonholonomic planning 
Kinodynamic planning 
Brief summary of complexity analysis 



CBHD formulas, flat systems, etc. 
15.2.1 Geodesic curve families 

Need to include Balkcom-Mason curves, Reeds-Shepp, Dubins, etc. 

A common theme for many planning approaches is to divide the problem into 
two phases. In the first phase, a holonomic planning method is used to produce 
a collision-free path that ignores the nonholonomic constraints. In the 
second phase, an iterative method attempts to replace portions of the holonomic 
path with portions that satisfy the nonholonomic constraints, yet still avoid obstacles. 
In general, this will lead to an incomplete algorithm because there is no 
guarantee that the original path provides a correct starting point for obtaining a 
nonholonomic solution. However, it typically leads to a fast planning algorithm. 

In this section, we describe this approach for the case of a car-like robot. 
Assume that a fast holonomic planning method has been selected for the first 
phase. Suppose that a path, $\tau : [0,1] \to C_{free}$, has been computed. The path 
can be iteratively improved as follows. Randomly select two real numbers $\alpha_1 \in [0,1]$ 
and $\alpha_2 \in [0,1]$. Assuming $\alpha_2 > \alpha_1$ (if not, then swap them), attempt to 
replace the portion of $\tau$ from $\tau(\alpha_1)$ to $\tau(\alpha_2)$ with a path segment that satisfies 
the nonholonomic constraints. This implies that $\tau$ is broken into three segments, 
$\tau_1 : [0,\alpha_1] \to C_{free}$, $\tau_2 : [\alpha_1,\alpha_2] \to C_{free}$, and $\tau_3 : [\alpha_2,1] \to C_{free}$. Note that 
$\tau_1(\alpha_1) = \tau_2(\alpha_1)$ and $\tau_2(\alpha_2) = \tau_3(\alpha_2)$. The portions $\tau_1$ and $\tau_3$ remain fixed, but $\tau_2$ 
is replaced with a new path, $\tau' : [\alpha_1,\alpha_2] \to C_{free}$, that satisfies the nonholonomic 
constraints. Note that $\tau'$ must also avoid collisions, and must satisfy $\tau'(\alpha_1) = \tau_1(\alpha_1)$ and 
$\tau'(\alpha_2) = \tau_3(\alpha_2)$. This procedure can be iterated multiple times until eventually, the original 
path is completely transformed into a nonholonomic path. Note that $\alpha_1 = 0$ and 
$\alpha_2 = 1$ must each have nonzero probability of being chosen in each iteration. In 
many iterations, the path substitution will fail; in this case, the previous path is 
retained. 
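The second phase can be sketched as a short loop. In this sketch the continuous path is replaced by a list of sampled configurations, so the random parameters $\alpha_1$ and $\alpha_2$ become list indices. The callables `steer` and `collision_free` are hypothetical placeholders for a steering method and a collision checker, and a fixed iteration count stands in for the test of whether the entire path already satisfies the constraints.

```python
import random


def refine_path(path, steer, collision_free, iterations=100, rng=None):
    """Iteratively replace portions of a sampled path with steered segments.

    path: list of configurations (a sampled version of the holonomic path).
    steer(qa, qb): hypothetical steering routine returning a list of
        configurations from qa to qb that satisfies the differential
        constraints while ignoring obstacles, or None on failure.
    collision_free(segment): hypothetical collision checker for a segment.
    """
    rng = rng or random.Random()
    path = list(path)
    for _ in range(iterations):
        if len(path) < 2:
            break
        # Pick two distinct indices with i < j; both endpoints may be chosen.
        i, j = sorted(rng.sample(range(len(path)), 2))
        segment = steer(path[i], path[j])
        if segment is None or not collision_free(segment):
            continue  # substitution failed, so the previous path is retained
        path = path[:i] + list(segment) + path[j + 1:]
    return path
```

The segment returned by `steer` is expected to start at `path[i]` and end at `path[j]`, which keeps the first and last configurations of the path unchanged.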

To make this and related approaches succeed, a fast technique is needed that 
constructs a nonholonomic path between any two configurations. Although this 
might appear as difficult as the original nonholonomic planning problem, it is 
assumed that the obstacles are ignored. In general, this is referred to as the 
steering problem, which has received a considerable amount of attention in recent 
years, particularly for car-like robots that pull trailers. For the case of a simple 
car-like robot with a limited steering angle, there are some analytical solutions 
to the problem of finding the shortest path between two configurations. In 1957, 
Dubins showed that for a car that can only go forward, the optimal path will take 
one of the six following forms: 

{LRL, LSL, LSR, RLR, RSR, RSL}. 

Each sequence of labels indicates the type of path. For example, "LRL" indicates 
a path that consists of a sharp left turn, immediately followed by a sharp right 
turn, immediately followed by a sharp left turn. Above, "S" denotes a straight 
segment. For a given pair of configurations, one can simply attempt to connect 
them using all six path types. The one with the shortest path length among the 
six choices is known to be the minimum-length path out of all possible paths. 
This path provides a nice substitution for $\tau_2$, as described above. 

For the case of a car-like robot that can move forward or backwards, Reeds 
and Shepp showed in 1990 that the optimal path between two configurations will 
take one of 48 different forms. Although this situation is more complicated, the 
same general strategy can be applied as for the case of a forward-only car. 






15.2.2 Series Methods 

15.3 Sampling-Based Planning Methods 

Currently much of this section is pasted from a recent paper by Peng Cheng and 
Steve LaValle 

Problem $\mathcal{P}$ is transformed into a multistage decision problem, which is called a 
discretized problem $\mathcal{P}$. At each stage there is a simpler motion planning problem, 
which is solved by a local planner. After the discretization, we expect that in 
most cases, an exact solution to $\mathcal{P}$ will no longer exist. Therefore, we assume that 
when $\mathcal{P}$ is given, a solution tolerance, $\epsilon_s \in [0, \infty)$, is specified. 

Note that: 1) the control set $\mathcal{U}(x)$ could be state dependent, which means that 
the sets of available controls for different states are different; 2) since controls are 
designed by the local planner, the control set $\mathcal{U}$ also depends on the local planner. 
For example, if the local planner can only return piecewise constant controls, $\mathcal{U}$ 
is only a subset of the control space. In some sense, the local planner introduces 
a discretization on the control space of $\mathcal{P}$. 

The discretization process partitions the time line into intervals. Any control 
$u \in \mathcal{U}$ is applied over some time interval. Let $t : \mathcal{U} \to [0, \infty)$ give the duration of 
any $u \in \mathcal{U}$. A control $u \in \mathcal{U}$ is defined as a piecewise continuous function from 
$[0, t(u)]$ into $U$; thus, it is not limited to a constant control. If $t(u) = t(u')$ for all 
$u, u' \in \mathcal{U}$, it is called a discretization with a fixed control sampling rate; otherwise, 
it is called a discretization with a varying control sampling rate. The range of the 
length of time intervals is denoted by an interval 

\[ \mathcal{D} = \left[ \inf_{u \in \mathcal{U}} t(u), \; \sup_{u \in \mathcal{U}} t(u) \right]. \tag{15.1} \]

We assume that $\inf_{u \in \mathcal{U}} t(u) > 0$ and $\sup_{u \in \mathcal{U}} t(u) < \infty$. Depending on the problem and 
discretization, $\mathcal{U}$ may or may not be finite. 

Classification of discretizations Based on whether the sampling rate is fixed 
and whether the control set is finite, there are four types of discretized motion 
planning problems: 

• FF: Fixed control sampling rate, a finite set of controls 

The first type has a finite set of controls. In [54, 203, 343, 463], for every 
motion planning problem at every stage, the local planner chooses a control 
in $\mathcal{U}$ and applies it to the system for a fixed period of time. 

A special case of this type is considered in [203, 212], in which a constant 
acceleration is applied to the system for a fixed period of time. For general 
systems, a non-constant control is necessary to maintain the constant acceleration. 
Thus, a non-trivial local planner needs to be assumed to provide 
the constant acceleration. In these planners, along each degree of freedom, 
there are only a finite number of accelerations at each stage, which makes 
$\mathcal{U}$ finite. The main difference between this case and more general systems 
is that the reachability graph in this case is always finite because the state 
space is compact, and the fixed control sampling rate is carefully chosen to 
ensure that the velocity bound is an integer multiple of the product of the 
acceleration and the sampling rate. 

• FI: Fixed control sampling rate, an infinite set of controls 

This case often appears in the definition of the system; however, the control 
space is usually discretized before a motion planning algorithm is employed. 
In this paper, we also consider the case in which continuous methods are 
used to select controls, which results in the FI case. 

• VF: Varying control sampling rate, a finite set of controls 

Problems of this type were considered in [524, 544, 669, 698]. At each stage, 
a non-trivial local planner drives the system from the current state to a 
finite number of adjacent neighboring states on a grid, resulting in a finite 
control set for each state on the grid. However, each control might last for 
a different amount of time, and $\mathcal{U}$ is state dependent. 

• VI: Varying control sampling rate, an infinite set of controls 

Problems of this type were considered in [164, 261, 381]. At every stage, 
the local planner may drive the system from the current state to a possibly 
infinite number of states. Each control designed by the local planner may 
last for a different amount of time, and $\mathcal{U}(x)$ might vary from state to state. 

15.3.1 An Incremental Search Framework 

The reachability graph (Currently much of this is pasted from a recent paper 
by Peng Cheng and Steve LaValle.) 

The reachability graph describes the connectivity between reachable states 
from $x_{init}$ and is fixed for a given $\mathcal{P}$. It is not something that is constructed by 
an algorithm; it simply exists once $\mathcal{P}$ is defined. The reachability graph will serve 
in this paper as an important frame of reference for comparing the search graph 
generated by an algorithm. 

Before defining the reachability graph, we define the set of reachable states at 
stage $k$, denoted as $\mathcal{R}_k$, by induction. First, $\mathcal{R}_0 = \{x_{init}\}$. At stage $k$: 

\[ \mathcal{R}_k = \{x \mid x = f(x',u), \; x' \in \mathcal{R}_{k-1}, \; u \in \mathcal{U}(x')\}. \tag{15.2} \]

The set of reachable states from $x_{init}$ is 

\[ \mathcal{R}_\infty = \bigcup_{k=0}^{\infty} \mathcal{R}_k. \tag{15.3} \]

For a given $\mathcal{P}$, the reachability graph is denoted by $\mathcal{G}(\mathcal{N}, \mathcal{E})$, in which $\mathcal{N}$ and $\mathcal{E}$ are the sets 
of nodes and directed edges of $\mathcal{G}$, respectively. Every node corresponds to a unique 
reachable state, which implies a bijection between $\mathcal{N}$ and $\mathcal{R}_\infty$. Every node $n$ in 
$\mathcal{G}$ is associated with a state $x(n) \in X$, and every edge $e \in \mathcal{E}$ is associated with a 
control $u(e)$. The same notation will also be used for the search graph defined in 
the next section with minor modification. An edge $e \in \mathcal{E}$ from node $n_s$ to node 
$n_e$ exists if there is a control $u(e) \in \mathcal{U}(x(n_s))$ such that $x(n_e) = f(x(n_s), u(e))$. We 
say that $\mathcal{G}$ is cyclic if it contains a directed cycle. 

If $\mathcal{N}$ is finite, $\mathcal{G}$ will be called finite; otherwise, $\mathcal{G}$ will be called infinite. Note 
that if $\mathcal{U}$ is infinite, then $\mathcal{G}$ might be infinite. This occurs for problems of FI and 
VI (as defined in Section ??). Intuitively, when $\mathcal{U}$ is finite, as in problems of FF 
and VF, it might seem that $\mathcal{G}$ would be finite. An interesting exception appears in 
[73], which provides conditions for $\mathcal{R}_\infty$ to be dense for a class of discrete-time 
chained-form systems with quantized control sets, i.e., finite or with values on 
regular meshes in $\mathbb{R}^m$ for some positive integer $m$. 
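The inductive definition of the reachable sets and the resulting graph can be illustrated on a toy discrete system. The sketch below assumes hashable states, a callable `controls(x)` that plays the role of the control set available at `x`, and a stage-to-stage transition `f(x, u)`; all names are hypothetical. Because the true reachability graph may be infinite, only a fixed number of stages is expanded.

```python
def reachable_sets(x_init, controls, f, stages):
    """Return [R_0, R_1, ..., R_stages] built by the inductive definition."""
    R = [{x_init}]
    for _ in range(stages):
        R.append({f(x, u) for x in R[-1] for u in controls(x)})
    return R


def reachability_graph(x_init, controls, f, stages):
    """Nodes and control-labeled directed edges found within `stages` stages.

    Edges are stored as (start_state, end_state, control) triples. States
    first reached at the final stage are included as nodes but not expanded.
    """
    nodes, edges, frontier = {x_init}, set(), {x_init}
    for _ in range(stages):
        next_frontier = set()
        for x in frontier:
            for u in controls(x):
                y = f(x, u)
                edges.add((x, y, u))
                if y not in nodes:
                    nodes.add(y)
                    next_frontier.add(y)
        frontier = next_frontier
    return nodes, edges
```

For the integer line with `controls = lambda x: (-1, 1)` and `f = lambda x, u: x + u`, two stages from 0 give the sets {0}, {-1, 1}, and {-2, 0, 2}. The graph is cyclic because 0 and 1 are connected in both directions.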

General algorithm description An iterative procedure is generally defined 
in which each iteration attempts to add a new edge and corresponding trajectory 
segment to a search graph. The steps are briefly enumerated here, and then 
further explanation follows: 

1. Initialization: Let $G(N, E)$ represent a directed search graph, for which 
the node set, $N$, contains a node for $x_{init}$ and possibly other states in $X_{free}$, 
and the edge set, $E$, is empty. 

2. Select Node: Choose a node $n_{cur} \in N$ for expansion. 

3. Generate Trajectory Segment: Use a local planning method to generate 
a trajectory from $x(n_{cur})$ to some state $x_{new}$ by applying some control $u_{new}$. 

4. Update Search Graph: Determine whether an edge will be added to $E$. 
If so, then $n_{cur}$ will be the starting node, and one of several possibilities exists 
for the ending node: 1) the ending node is selected from a node already in 
$N$, 2) a new node, $n_{new}$, with associated state $x_{new}$ is added to $N$, or 3) $n_{new}$ 
is added as in the previous case, but other nodes in $N$ may be deleted, and 
their associated edges are associated with $n_{new}$. 

5. Check for Solution: Determine whether $G$ encodes a solution to $\mathcal{P}$. 

6. Return to Step 2: Iterate unless a solution has been found or some termination 
condition is satisfied, in which case the algorithm reports failure. 
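The six steps above can be sketched as a generic loop. This is a minimal skeleton under several assumptions: states are stored directly as nodes, only the simplest update rule of Step 4 is implemented, and the callables `select_node`, `local_plan`, and `is_goal` are hypothetical interfaces supplied by the user.

```python
def incremental_search(x_init, select_node, local_plan, is_goal, max_iters=1000):
    """Generic incremental search loop.

    select_node(nodes): returns a state from `nodes` to expand (Step 2).
    local_plan(x): returns (u_new, x_new), or None if no segment is produced (Step 3).
    is_goal(x): True if reaching x solves the problem (Step 5).
    Returns (nodes, edges, goal_state), with goal_state None on failure.
    """
    nodes = [x_init]   # Step 1: N contains a node for x_init
    edges = []         # Step 1: E is empty
    if is_goal(x_init):
        return nodes, edges, x_init
    for _ in range(max_iters):           # Step 6: iterate until termination
        x_cur = select_node(nodes)       # Step 2
        result = local_plan(x_cur)       # Step 3
        if result is None:
            continue
        u_new, x_new = result
        if x_new not in nodes:           # Step 4: add a new node if needed
            nodes.append(x_new)
        edge = (x_cur, x_new, u_new)
        if edge not in edges:            # Step 4: add the edge
            edges.append(edge)
        if is_goal(x_new):               # Step 5
            return nodes, edges, x_new
    return nodes, edges, None            # termination without a solution
```

Different planners are obtained by changing `select_node` and `local_plan`; for instance, a selection rule based on random samples and nearest neighbors gives RRT-like behavior, while a queue-based rule gives breadth-first behavior.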

Initialization In a single-tree approach, such as the planner of Barraquand and 
Latombe [54], only one node, $n(x_{init})$, exists in $N$. In a bidirectional approach, 
such as RRTExtExt [464], $n(x_{goal})$ may also be included in $N$. One could initially 
place thousands of nodes in $N$, as in the case of initializing a probabilistic roadmap 
[387, ?] with uniform random samples from $X_{free}$. 

Select node This step is critical to the global search behavior of the algorithm; 
it is similar in some ways to the search queue prioritization in classical AI search. 
If dynamic programming is used, as in [54], then Step 2 selects a node with untried 
controls that has the shortest trajectory distance to $x_{init}$. Other possibilities are 
depth-first, breadth-first, or A* [622]. In the case of an RRT [145, 466], a state, 
$x_{rand}$, is generated at random in $X$, and then the nearest node (with respect to 
a metric on $X$) to $x_{rand}$ is returned. Numerous other possibilities exist based on 
other algorithms (e.g., [164, 212, 261, 343]). 

Generate trajectory segment Step 3 is implemented by a local planner, which 
may be considered as a separate component that produces a control $u_{new} \in \mathcal{U}$ that 
evolves the system from $x_{cur}$ to some state $x_{new}$. 

Some local planners may attempt to reach another predetermined node, say 
$n_{tar}$ [261, 524, 669, 698]. We refer to these as connecting local planners. These local 
planners may either: 1) succeed in exactly reaching $x(n_{tar})$, 2) yield a trajectory 
that ends within a specified distance bound from $x(n_{tar})$, or 3) fail to reach 
sufficiently close to $x(n_{tar})$, in which case another node must be selected in Step 
2. For the second condition, suppose that the user specifies a tolerance, $\rho > 0$, 
and requires that the local planner must achieve $\|x_{new} - x_{tar}\| < \rho$ to report 
success in reaching $n_{tar}$. If a connecting local planner is permitted to succeed 
under condition 2, then it is called approximate; otherwise, it is called exact if it 
only succeeds under condition 1. 

To be consistent with the definition of $\mathcal{G}$ in Section 15.3.1 even when approximate 
local planners with tolerance $\epsilon_l$ are used, we model an approximate local 
planner as an exact local planner plus a sampling process. Given the initial state 
$x_i$ and goal state $x_g$, a state $x_s$ in the $\epsilon_l$ neighborhood of $x_g$ is first sampled, and 
then a control is designed by the associated exact local planner to connect $x_i$ and 
$x_s$. With this model, $\mathcal{G}$ is built using only exact local planners, ensuring that 
there will be no discontinuities, and $\mathcal{U}(x)$ will be precisely the controls associated 
with edges in $\mathcal{G}$. 

Since the algorithm must run for a countable number of iterations, the local 
planner actually uses a fixed, countable set of controls $\mathcal{U}_s$. Control set $\mathcal{U}_s$ is 
generally obtained by sampling. 

As mentioned in Section ??, $U$ for state $x$ depends on the local planner. If an
exact connecting local planner is used, $U$ for state $x$ consists of the controls
that drive the system from $x$ to states reachable from $x$. If we assume that the
connecting local planner returns a unique solution for a given initial and goal
state, then sampling in $U$ and sampling in $X$ are equivalent; that is, for every
sampled state that is reachable from $x$, a control in $U$ is sampled. Therefore,
the dispersion of the state space sampling can be used to characterize the
control space sampling. Note that the above sampling procedure can also be
applied when an approximate connecting local planner is used. The difference is
that there exist control errors between the sampled controls and the controls
in $U$ because the sampled controls only drive the system from initial states to
a neighborhood of the goal states.

When a non-connecting local planner is used, the sampling in $U$ normally has
two steps. The first sampling is in the interval $\mathcal{D}$. Each sampled point
corresponds to a set of controls in $U$ with the same duration, on which the
second sampling procedure is performed. The quality with which the sampled set
$U_s$ approximates $U$ may be characterized by dispersion [579], which may be
defined as follows for two sets, $A$ and $B$, such that $A \subseteq B$ and $B$ is a
subset of a normed space. The dispersion of $A$ with respect to $B$ is

$$\sup_{b \in B} \, \inf_{a \in A} \|a - b\|, \qquad (15.4)$$

in which $\|\cdot\|$ is the norm. Therefore, the first sampling procedure is
characterized by $\epsilon_t \in [0, \infty)$, which denotes the dispersion of the set
of durations of controls in $U_s$ with respect to $\mathcal{D}$. The second sampling
procedure is uniformly characterized by $\epsilon_u \in [0, \infty)$; that is, for every
sampling set on which the second sampling procedure is performed, the
dispersion of the sampled control set with respect to this sampling set is
always $\epsilon_u$. The control set $U_s$ could either be provided as a parameter to
the algorithm or be calculated by the algorithm from given $\epsilon_u$ and $\epsilon_t$.
Conversely, if $U_s$ is given, then $\epsilon_u$ and $\epsilon_t$ can be calculated from it.
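
As a numerical illustration, the dispersion of a finite sample set can be
estimated in Python. This is only a sketch: the supremum over $B$ is
approximated by a maximum over a finite set of test points, points are
represented as tuples, and the Euclidean norm is assumed.

```python
import math

def dispersion(A, B):
    """Distance from the worst-covered point of B to its nearest point in A.

    A: finite sample set (list of tuples).
    B: finite set of test points standing in for the full set B (assumption).
    """
    return max(min(math.dist(a, b) for a in A) for b in B)

# Durations sampled at 0.25 and 0.75 inside the interval [0, 1],
# with [0, 1] represented by 101 evenly spaced test points.
interval = [(i / 100,) for i in range(101)]
samples = [(0.25,), (0.75,)]
eps_t = dispersion(samples, interval)
```

Here the worst-covered durations are 0, 0.5, and 1, each at distance 0.25 from
the nearest sample, so eps_t evaluates to 0.25.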

Update search graph As indicated in Step 4, there are several possible
alternatives for updating the graph, depending on the particular algorithm. The
simplest case is to add to $G$ an edge associated with $u_{new}$ that connects
$n_{cur}$ to a new node associated with $x_{new}$. If the local planner exactly
reaches some existing node, $n_{tar}$, then the edge is added from $n_{cur}$ to $n_{tar}$
since $x_{new} = x(n_{tar})$; otherwise, the systematic search requirement would be
violated. If an approximate connecting local planner is used, then the behavior
is the same, except that the edge is added from $n_{cur}$ to $n_{tar}$ whenever
$\|x_{new} - x(n_{tar})\| < \epsilon_l$.

An additional complication is caused by state space discretization. Some plan- 
ners try to avoid generating nodes that are too close to each other. For example, 
in the method of Barraquand and Latombe [54], the space is partitioned into a 
tiling of rectangular cells. As the algorithm runs, a new node is introduced in the 
search graph only when it lies in an unvisited cell. This ensures that each cell will 
contain no more than one node, which prevents the algorithm from generating a 
countably infinite number of states in the search graph. 

The resolution of this state space discretization and many other schemes can
be expressed in terms of the dispersion of the search graph nodes with respect
to $X_{free}$. Let $\epsilon_d \in [0, \infty)$ denote a bound on the dispersion due to state
space discretization. If $u_{new}$ leads to a violation-free trajectory, then the
algorithm must add $x_{new}$ to $N$ if the nearest vertex in $N$ is at least $\epsilon_d$
from $x_{new}$. If $x_{new}$
is within distance $\epsilon_d$ of some other nodes in $N$, then one of two behaviors may
occur, depending on the particular algorithm: 1) $x_{new}$ is discarded, or 2) $x_{new}$ is
inserted, but all existing nodes within distance $\epsilon_d$ of $x_{new}$ are deleted.

The parameter $\epsilon_d$ may be used to specify the size of the cells. For example,
under an $L_\infty$ metric, $\epsilon_d$ directly gives the maximal cell width. In addition to
using predefined partitions, other schemes may be possible. For example, a new
node may be inserted into the search graph only if there exist no other nodes
within distance $\epsilon_d$.
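
The spacing rule and its two possible behaviors can be sketched in Python as
follows. The node set is assumed to be a plain list of state tuples, distances
are Euclidean, and the bookkeeping for edges attached to deleted nodes is
omitted; insert_node is a hypothetical name.

```python
import math

def insert_node(nodes, x_new, eps_d, replace=False):
    """Apply the eps_d spacing rule to x_new; return True if it was inserted.

    replace=False: behavior 1, x_new is discarded when it crowds a node.
    replace=True:  behavior 2, crowded nodes are deleted and x_new is kept.
    """
    close = [n for n in nodes if math.dist(n, x_new) < eps_d]
    if close and not replace:
        return False
    for n in close:
        nodes.remove(n)
    nodes.append(x_new)
    return True
```

A linear scan over all nodes is used only for brevity; a cell-based partition
would allow the same test to be made by looking up the cell that contains the
new state.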

Check for solution Step 5 must determine whether a solution in the sense
defined in ?? exists within the graph. A candidate solution can be constructed
from any connected sequence, $(e_1, \ldots, e_k)$, of edges (a path) in $G$ that starts
at $n(x_{init})$. Starting at $x_{init}$, the control $u(e_i)$ is applied over its specified
duration, for each $i$ from 1 to $k$. The motion equation, $f$, is integrated during
this process, and it can be determined whether the resulting trajectory is
violation free and terminates within $\epsilon_s$ of $x_{goal}$.
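
A minimal Python sketch of this check is given below. It assumes that each
edge stores a constant control and a duration, that $f$ is integrated by the
forward Euler method with a small step, and that violations are detected by a
user-supplied predicate; these choices and all names are illustrative only.

```python
import math

def check_candidate(f, x_init, controls, x_goal, eps_s, is_free, dt=0.01):
    """Integrate xdot = f(x, u) along a path and test it as a solution.

    controls: sequence of (u, duration) pairs, one per edge of the path.
    is_free:  predicate that returns False for a violating state.
    """
    x = tuple(x_init)
    for u, duration in controls:
        steps = max(1, round(duration / dt))
        h = duration / steps                      # Euler step for this edge
        for _ in range(steps):
            x = tuple(xi + h * di for xi, di in zip(x, f(x, u)))
            if not is_free(x):
                return False                      # trajectory is not violation free
    return math.dist(x, x_goal) < eps_s           # must end within eps_s of the goal
```

The accuracy of the check depends on the integration step, so a higher-order
integrator or a smaller dt may be needed for systems with fast dynamics.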

The determination of which sequences to check for solutions depends on the 
particular algorithm. In many cases, the number of new candidate solutions that 
appear in one iteration may be small. In this case, all of them could be checked. 
In other algorithms, heuristics may be used to avoid considering too many
candidate solutions. For example, a connection tolerance could be given, and
solution checking is performed only when the distance between states in two
subgraphs is less than the tolerance. For the purposes of resolution
completeness analysis, we assume that no such pruning is performed.

15.3.2 Tree-Based Dynamic Programming 

The forward dynamic programming (FDP) method is similar to an RRT in that it
grows a tree from $x_{init}$. The key difference is that FDP uses dynamic programming
to decide how to incrementally expand the tree, as opposed to nearest neighbors
of random samples. FDP performs a systematic exploration over a fine-resolution
grid that is placed over the state space. This limits its applicability to
low-dimensional state spaces (up to 3 or 4 dimensions).

The state space, $X$, is divided into a rectangular grid (typically there
are a hundred grid points per axis). Each element of the grid is called a cell, which
designates a rectangular subset of $X$. One of three different labels can be applied
to each cell:

• OBST: The cell contains points in $X_{obs}$.

• FREE: The cell has not yet been visited by the algorithm, and it lies entirely
in $X_{free}$.

• VISITED: The cell has been visited, and it lies entirely in $X_{free}$.

Initially, all cells are labeled either FREE or OBST by using a collision detection
algorithm.

Let $Q$ represent a priority queue in which the elements are states,
sorted in increasing order according to $L$, which represents the cost accumulated
along the path constructed so far from $x_{init}$ to $x$. This cost can be assigned in
many different ways. It could simply represent the time (the number of $\Delta t$
steps), or it could count the number of times a car changes direction.

The algorithm proceeds as follows: 



FORWARD_DYNAMIC_PROGRAMMING(x_init, x_goal)
    Q.insert(x_init, L);
    G.init(x_init);
    while Q ≠ ∅ and FREE(x_goal)
        x_cur ← Q.pop();
        for each x ∈ NBHD(x_cur)
            if FREE(x)
                Q.insert(x, L);
                G.add_vertex(x);
                G.add_edge(x_cur, x);
                Label cell that contains x as VISITED;
    Return G;

The algorithm iteratively grows a tree, $G$, which is rooted at $x_{init}$. The NBHD
function tries the possible inputs and returns the set of states that can be
reached in time $\Delta t$. For each of these states, if the cell that contains it
is FREE, then $G$ is extended. At any given time, there is at most one vertex
per cell. The algorithm terminates when the cell that contains the goal has been
reached.
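
The procedure can be sketched in runnable Python as follows. Several details
are assumptions made for illustration: NBHD is implemented as one forward
Euler step per input, the cost $L$ is the elapsed time, the tree is stored as a
parent map, obstacle cells are identified by a user-supplied predicate on
integer cell indices, and the cell of the initial state is marked as visited
at the start. The predicate is expected to report cells outside the grid as
obstacles so that the search terminates.

```python
import heapq
import math

def forward_dp(x_init, x_goal, inputs, f, dt, cell_size, is_obst):
    """Grow a tree with at most one vertex per grid cell.

    Returns (parent, reached): a map from each vertex to its parent, and
    whether the cell that contains x_goal was reached.
    """
    def cell(x):
        return tuple(math.floor(xi / cell_size) for xi in x)

    visited = {cell(x_init)}
    parent = {x_init: None}
    queue = [(0.0, x_init)]                  # entries are (cost L, state)
    goal_cell = cell(x_goal)
    while queue and goal_cell not in visited:
        cost, x_cur = heapq.heappop(queue)
        for u in inputs:                     # NBHD: one Euler step per input
            x = tuple(xi + dt * di for xi, di in zip(x_cur, f(x_cur, u)))
            c = cell(x)
            if c in visited or is_obst(c):
                continue                     # the cell is not FREE
            visited.add(c)                   # label the cell as VISITED
            parent[x] = x_cur                # add the vertex and its edge
            heapq.heappush(queue, (cost + dt, x))
    return parent, goal_cell in visited
```

A solution path, if one was found, can be recovered by following the parent
map backward from the vertex that lies in the goal cell.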

15.3.3 RDT-Based Methods 

This is very incomplete... 

The RRT planning method can be easily adapted to the case of nonholonomic 
planning. All references to configurations are replaced by references to states; this 
is merely a change of names. The only important difference between holonomic 
planning and nonholonomic planning with an RRT occurs in the EXTEND procedure.
For holonomic planning, the function NEW_CONFIG generated a configuration
that lies on the line segment that connects $q$ to $q_{near}$. For nonholonomic
planning, motions must be generated by applying inputs. The NEW_CONFIG
function is replaced by NEW_STATE, which attempts to apply all of the inputs
in $U$, and selects the input that, when applied from $x_{near}$, generates an $x_{new}$
that is closest to the target state $x$ with respect to the metric $\rho$. If $U$ is
infinite, then it can be approximated with a finite set of inputs.
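
A minimal Python sketch of such a NEW_STATE function is shown below. It
assumes that the selection criterion is the distance from the resulting state
to the state toward which the tree is being extended, that each input is held
constant for one forward Euler step of length dt, and that the metric defaults
to the Euclidean distance; the signature is hypothetical.

```python
import math

def new_state(x_target, x_near, inputs, f, dt, metric=math.dist):
    """Try every input from x_near and keep the one that ends nearest x_target.

    inputs: finite set of inputs (a sampled approximation if U is infinite).
    f:      motion equation, called as f(x, u) and returning xdot.
    Returns (x_new, u_new).
    """
    best = None
    for u in inputs:
        x_new = tuple(xi + dt * di for xi, di in zip(x_near, f(x_near, u)))
        d = metric(x_new, x_target)
        if best is None or d < best[0]:
            best = (d, x_new, u)
    return best[1], best[2]
```

A full planner would also reject any candidate whose trajectory causes a
violation before comparing distances.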

15.3.4 Other Sampling-Based Methods

Hsu, Kindel, Latombe, Rock 

Sampling-based roadmap approaches 

15.4 Gradient-Based Optimization Techniques 

The importance of gap reduction 

15.5 Optimal Feedback Strategies 

15.5.1 Problem Definition 

15.5.2 Exact Solutions for Linear Systems 

15.5.3 Functional Dynamic Programming 